




Europäisches Patentamt
European Patent Office

Office européen des brevets (11) Publication number: 0 541 335 A1

EUROPEAN PATENT APPLICATION

(21) Application number: 92310067.1 (51) Int. Cl.5: C12N 15/62, C12N 5/10

(22) Date of filing: 04.11.92



Inventor: Marshall, Mark S.
1519 Spruce Court

Carmel, Indiana 46032 (US)

Inventor: Hawe, Linda

2610 Skippack Pike 

Norristown, PA 19403 (US) 

Inventor: Montgomery, Donna L. 

9 Hickory Lane 

Chalfont, PA 18914 (US) 

Inventor: Oliff, Allen A.

1412 Florence Drive 

Gwynedd Valley, PA 19437 (US) 

Inventor : Shi, Xiao-Ping 

536 Winthrop Road 

Collegeville, PA 19426 (US) 

Inventor: Ulmer, Jeffrey 

128 Dolly Circle 

Chalfont, PA 18914 (US) 

(74) Representative : Thompson, John Dr. et al 
Merck & Co., Inc. European Patent 
Department Terlings Park Eastwick Road 
Harlow, Essex CM20 2QR (GB) 



(30) Priority: 08.11.91 US 792507

(43) Date of publication of application:
12.05.93 Bulletin 93/19

(84) Designated Contracting States:
CH DE FR GB IT LI NL

(71) Applicant: MERCK & CO. INC.

126, East Lincoln Avenue P.O. Box 2000
Rahway New Jersey 07065-0900 (US)

(72) Inventor: Donnelly, John J.
1505 Brierwood Road 
Havertown, PA 19083 (US) 
Inventor : Liu, Margaret A. 
4 Cushman Road 
Rosemont, PA 19010 (US) 
Inventor : Friedman, Arthur 
121 Froghollow Road 
Churchville, PA 18966 (US) 



(54) Recombinant DNA sequences and plasmids for cellular immunity vaccines from bacterial
toxin-antigen conjugates.



(57) Recombinant DNA sequences coding for hybrid proteins having two primary components.
The first component is a modified bacterial toxin that has translocating ability, while the
second component is a polypeptide or protein that is exogenous to an antigen-presenting cell.
The hybrid has the ability to be internalized by an antigen-presenting cell, where the hybrid is
subsequently processed and an antigenic segment of the hybrid presented on the surface of
the antigen-presenting cell, where the segment elicits an immune response by cytotoxic T
lymphocytes.
BACKGROUND OF THE INVENTION

The numerous substances and organisms that threaten the existence of animals having immune systems
are either present in extracellular body fluids, such as toxins or bacteria, or else they are harbored within the
animal's own cells, such as viruses, certain parasites and oncogene products. This distinction is important to
thymus-derived lymphocytes, also known as T cells, which are an important component of vertebrate immune
systems. T cells have evolved parallel systems for recognizing intracellular and extracellular antigens. In both
systems, antigens are recognized only when they are bound to molecules of the major histocompatibility
complex (MHC).

The MHC encodes two types of cell surface molecules that act as receptors for protein antigens. Class I
MHC molecules consist of a highly polymorphic integral membrane glycoprotein alpha chain that is
noncovalently bound to a beta-2 microglobulin. Class II MHC molecules consist of two noncovalently bound,
highly polymorphic, integral membrane glycoproteins. Class I MHC molecules have a groove at the top surface
formed by the two amino-terminal domains. The groove holds an antigen. As with other cell surface proteins,
during cellular processing in the cytosol, MHC molecules are inserted into the endoplasmic reticulum (ER) and,
following chain assembly, are transported to the plasma membrane of the cell via the Golgi complex and
post-Golgi complex vesicles.

The recognition of Class I vs. Class II molecules as antigen-presenting sites in general divides T cells into
two classes, respectively termed cytotoxic T cells (Tc) and helper T cells (TH). Tc cells directly lyse cells that
are infected with viruses or certain parasites and also will secrete cytokines such as gamma-interferon in order
to eradicate intracellular pathogens and tumors.

Virtually all cell types can serve as antigen-presenting cells for Tc cells as long as they express MHC Class
I molecules. In general, Tc cells require antigen-presenting cells that are actively biosynthesizing antigen.
During processing, the antigen is bound to a nascent Class I molecule in the ER and transported to the plasma
membrane via the Golgi complex and post-Golgi complex vesicles. At the plasma membrane, the processed
antigen sits in the groove of the MHC Class I molecule, where the processed antigen is available for binding
to cell surface receptors of Tc cells. Activation of Tc cells requires interaction between multiple Tc cell surface
molecules and their respective ligands on antigen-presenting cells. Once activation has taken place, the lysing
and cytokine secretion activity described above can begin.

Antigen processing is the structural modification and trafficking, within the proper subcellular
compartments, of protein antigens that enable the determinants recognized by Tc cells to interact with MHC molecules.
As noted above, most, and possibly all, somatic cells expressing MHC Class I molecules constitutively process
antigens and transport determinants to the cell surface for Tc cell recognition. Antigen processing is thus
required for the presentation of intact, folded proteins to Tc cells. Commonly, antigen processing entails the
generation of short peptides by cellular proteases, although some intact proteins productively associate with MHC
molecules, indicating that proteolysis is not necessarily a component of antigen processing.

Two distinct pathways are used by cells to process antigens. The endosomal pathway is so named
because it is accessed through the endosomal compartment. Determinants produced by this pathway usually
associate with Class II MHC molecules. The other pathway is the cytosolic pathway. The cytosolic pathway
is so named because it can be accessed from the cytosol of the cell by the synthesis of proteins within the
cell, or by penetration of plasma or endosomal membranes by extracellular proteins. Such penetration may
occur naturally through the fusion of the cell's membrane with a virus, or artificially by osmotic lysis of
antigen-containing pinosomes. Determinants produced by cytosolic processing typically associate with Class I MHC
molecules. The cytosolic pathway is able to process many different types of foreign proteins for presentation
to Tc cells.

Class I MHC molecules associate with antigens in a compartment of the ER. In this regard, it is important 
to note that the compound Brefeldin A acts by interfering with the normal vesicular traffic between the ER and 
the Golgi apparatus, and thus also has the effect of blocking the presentation of cytosolically processed an- 
tigen on the surface of what would otherwise be an antigen-presenting cell. 

It can be seen from the above discussion that, in order to generate a response by a cytotoxic T cell, it is
generally necessary either to cause the target cell, which has been chosen as an antigen-presenting cell, to
endogenously synthesize the protein antigen of interest, or to deliver exogenous protein antigen of interest
directly into the cytosolic antigen processing pathway of the target cell. If the latter could be accomplished, a
vaccine could be produced which would elicit cytotoxic T cells capable of killing virally or parasitically infected
cells or tumor cells, thereby having particular usefulness for preventing three clinical types of diseases.

First, such vaccines could prevent infections caused by viruses such as papilloma or herpes virus which 
do not undergo a blood-borne phase of infection. This would be especially true in the case of human papilloma 
virus E7 protein, which is continuously cellularly expressed in the transformed phenotype, and would thus be 
particularly well suited to attack by sensitized cytotoxic T lymphocytes.

Secondly, there are those infections caused by viruses such as influenza or human immunodeficiency
virus (HIV) or parasites whose outer proteins may have high antigenic variability, making it difficult to design
a vaccine capable of eliciting protective titers of high affinity antibodies with broad specificity. Certain viral
internal proteins have less antigenic variation, and peptides derived from such proteins, when associated with
Class I MHC molecules, would render infected cells susceptible to lysis by sensitized cytotoxic T lymphocytes.

Thirdly, tumors and virally transformed cells express neoantigens that may be presented on Class I MHC 
molecules, thus rendering these cells suitable targets for cytotoxic T lymphocyte lysis. 

Current vaccines generally focus on generating humoral (that is, antibody) responses of the immune
system, rather than the cellular immune responses discussed above. Those that do generate cellular immune
responses use attenuated live viruses which replicate intracellularly, so that their constituents are synthesized
within the infected cell and thereby become available to the appropriate antigen processing pathway. Thus,
there is a need for a non-replicating vaccine that will sensitize cytotoxic T lymphocytes to produce a cellular
immune response with a significantly greater margin of safety.

The present invention meets this need by capitalizing on the ability of certain bacterial exotoxins to be
internalized into cells through endocytosis via receptors on the cell surface and then translocate out of the
resultant endosomes into the cellular compartment in which endogenous proteins are processed for
presentation. These exotoxins have been hybridized with polypeptide or protein antigens, which are carried into the
cytoplasm and are processed to peptides capable of association with Class I MHC molecules via the
physiologic processes discussed above. Once associated with a Class I MHC molecule and presented on the surface
of the antigen-presenting cell, they can sensitize cytotoxic T lymphocytes against other infected cells
synthesizing the same polypeptide or protein. By virtue of these actions, the invention presents vaccines which can
be effective in prophylaxis against viruses, parasites and malignancies.

It is an additional object of the present invention to produce hybrid proteins of certain bacterial exotoxins
having translocation domains, hybridized with polypeptides or proteins selected for their antigenic activity,
which hybrids will be useful as probes for studying the intracellular processing and subsequent presentation
of endogenously synthesized cytoplasmic proteins.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the structural domains of Pseudomonas exotoxin, along with the numbers of the amino
acid residues that define the known limits of the structural domains. Amino acid residues are numbered as
defined in Gray, et al., PNAS USA 81: 2645-2649 (1984).

Figure 2 is a restriction map for plasmid pVC45-DF+T.

Figure 3 is a restriction map for plasmid pBluescript II SK.

Figure 4 is a restriction map for plasmid pBR322.

Figure 5 is a graph showing the results of using hybrid construct PEMa in immunologically sensitizing
U-2 OS cells, a human cell line.

Figure 6 shows that a hybrid protein made of the binding and translocating domains of Pseudomonas
exotoxin and a peptide epitope of influenza A matrix protein can competitively prevent the intact Pseudomonas
exotoxin from binding to and killing target cells.

SUMMARY OF THE INVENTION 


The invention is the recombinant DNA sequences coding for a hybrid protein of two species, the first
species being a modified bacterial toxin that has a translocating domain. The second species is a polypeptide or
protein. The polypeptide or protein is exogenous to an antigen-presenting cell of interest. The hybrid of the
bacterial toxin and the exogenous polypeptide or protein is constructed in such a way as to be capable of
eliciting an immune response by cytotoxic T lymphocytes. Also included are suitable plasmids and methods
of using the recombinant sequences to obtain the hybrid proteins of interest.

A preferred bacterial toxin is a modified Pseudomonas exotoxin. Pseudomonas exotoxin is known to
consist of four structural domains, namely Ia, II, Ib and III. This is shown at Figure 1, along with the numbers of
the amino acid residues that define the known limits of the structural domains. More preferably, the
Pseudomonas exotoxin is modified by deletion of structural domain III, that is, the ADP-ribosylating structural domain,
although alternatively domain III need not be entirely deleted, but may rather be sufficiently altered in its amino
acid sequence so as to render it enzymatically nonfunctional as an ADP-ribosylating enzyme. Most preferably,
the modified bacterial toxin has only a cellular recognition domain and a translocating domain (with or without
the 5 C-terminal amino acids of Domain III added to the C-terminus of the polypeptide or protein antigen), or
even just the translocating domain with or without targeting ligand. In the case of Pseudomonas exotoxin, the
cellular recognition domain and translocating domain are known to exist within structural domains Ia, II and
Ib. Also most preferably, modified Pseudomonas exotoxins are arranged on the amino-terminal side of the
hybrid, while the exogenous polypeptide or protein is arranged on the carboxy-terminal side of the hybrid.

The exogenous polypeptide or protein, which is exogenous to an antigen-presenting cell of interest, is
preferably a polypeptide or protein of viral origin. More preferably, the viral polypeptide is a viral protein fragment,
and most preferably is taken from the group comprising the matrix protein of influenza A virus; residues 57 to
68 of the matrix protein of influenza A virus (the matrix epitope known to bind MHC HLA-A2); the nucleoprotein
of influenza A virus; or the GAG protein of human immunodeficiency virus-1.

Functionally, the hybrid is capable of eliciting an immune response by cytotoxic T lymphocytes, by virtue
of being at least partially presented on an antigen-presenting cell surface. More specifically, the hybrid
functionally is capable of being internalized by an antigen-presenting cell and further capable of being processed,
via the endogenous protein processing pathway, on its way to at least partial presentation on the surface of
the antigen-presenting cell.

The hybrid proteins preferably will use polypeptide or protein antigens for use as a vaccine, and most
preferably will use viral antigens. Most preferably, these viral antigens will be conserved viral proteins. The hybrids
will be incorporated, in an amount sufficient to elicit an immune response by cytotoxic T lymphocytes, into
vaccines further comprising pharmaceutically acceptable carriers. The vaccines will be sufficient to immunize a
host against influenza, acquired immunodeficiency syndrome, human papilloma virus, cytomegalovirus,
Epstein-Barr virus, rotavirus, and respiratory syncytial virus, as well as tumors and parasites.

The present invention further relates to recombinant DNA segments containing nucleotide sequences
coding for the fused proteins described above, as well as plasmids and transformants harboring such recombinant
DNA segments, as well as methods of producing the hybrid proteins using such recombinant DNA segments
and methods of administration of the hybrid proteins as vaccines to hosts.

DETAILED DESCRIPTION OF THE INVENTION 

The term "translocating domain" shall mean a sequence of amino acid residues sufficient to confer on a
polypeptide or protein the ability to translocate across a cell membrane into a cellular compartment for
processing endogenous proteins.

The term "exogenous to an antigen-presenting cell" shall mean polypeptides that are not encoded by the
unmutated genome of a given antigen-presenting cell.

The term "antigen-presenting cell" shall refer to a variety of cell types which carry antigen in a form that
can stimulate cytotoxic T lymphocytes to an immunologic response.

The term "immune response" shall mean those cytotoxic processes of cell lysis and cytokine release
engaged in by cytotoxic T lymphocytes that have been stimulated by antigen presented by an antigen-presenting
cell. This term shall also include the ability of a host's cytotoxic T lymphocytes to retain their cytotoxic response
to subsequent exposure to the same antigen that will lead to more rapid elimination of the antigen than in a
non-immune state.

The term "presented on an antigen-presenting cell surface" shall mean that process by which an antigen
is seated within a ligand site of a major histocompatibility complex Class I protein on the surface of an
antigen-presenting cell.

The term "being internalized by an antigen-presenting cell" shall mean the process of endocytosis resulting
in endosome formation.

The term "cellular recognition domain" shall mean a sequence of amino acid residues in a polypeptide
sufficient to confer on that polypeptide the ability to recognize a receptor site on the surface of a target cell.

The term "ADP ribosylating domain" shall mean a sequence of amino acids sufficient to confer on a
polypeptide the ability to modify elongation factor 2 within a cell, and thereby severely impair the viability of the
cell or kill it.

The term "vaccine" shall mean a pharmaceutically acceptable suspension of a given therapeutic entity
administered for the prevention, amelioration or treatment of infectious diseases.

The term "conserved viral protein" shall mean those viral proteins that do not vary from strain to strain of
a given species of virus, or to those viral proteins that are generally unlikely to undergo mutation as a function
of time in a given strain.

The term "arranged on the amino terminal side of said hybrid" shall mean that a peptide sequence has 
been inserted at any point between the amino terminus of a hybrid and the hybrid's middle amino acid residue. 

The term "arranged on the carboxy terminal side of said hybrid" shall mean that a peptide sequence has
been inserted at any point between the carboxy terminus of a hybrid and the hybrid's middle amino acid
residue.
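
The two positional definitions above amount to comparing an insertion point with the hybrid's middle residue. A minimal sketch of that comparison follows, assuming 1-based residue numbering and taking the middle of an n-residue hybrid to be position (n + 1) / 2. The function name `terminal_side` and the example length are hypothetical and are not taken from the specification.

```python
def terminal_side(position: int, hybrid_length: int) -> str:
    """Classify a 1-based residue position relative to the hybrid's middle residue."""
    if not 1 <= position <= hybrid_length:
        raise ValueError("position must lie within the hybrid")
    # Assumption: the "middle amino acid residue" of an n-residue chain sits
    # at (n + 1) / 2, which falls between two residues when n is even.
    middle = (hybrid_length + 1) / 2
    if position < middle:
        return "amino-terminal side"
    if position > middle:
        return "carboxy-terminal side"
    return "middle residue"


if __name__ == "__main__":
    # Hypothetical 101-residue hybrid: residue 51 is the middle one.
    for pos in (10, 51, 90):
        print(pos, terminal_side(pos, 101))
```

For a hybrid with an even number of residues, no position is classified as the middle residue under this assumption; a different convention could reasonably be chosen.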

The term "transformant" shall mean an independent, self-replicating DNA molecule, and shall include plas- 
mids. 

The hybrid proteins of the present invention are fusion protein constructs of a bacterial toxin having a
translocating domain fused to a polypeptide or protein that has been selected for its antigenicity for a given disease,
as well as for being exogenous to a targeted antigen-presenting cell. A preferred bacterial toxin is the
Pseudomonas exotoxin. This exotoxin is known to comprise four structural domains, as shown in Figure 1. These
domains are designated Ia, II, Ib and III. Structural domain Ia is known to be necessary for binding of the
exotoxin to a receptor site on the surface of a target cell. Structural domain II is known to be necessary for
translocation of the exotoxin across an internal membrane of the targeted cell. Part of structural domain III is
known to be an ADP-ribosylating enzyme that binds to the protein Elongation Factor 2, which generally results
in the death of the target cell.

In a preferred embodiment of the present invention, structural domain III (or all of domain III except for the
C-terminal amino acids) has been deleted from the Pseudomonas exotoxin molecule, and has been replaced
with one of several polypeptides or proteins chosen for their ability to act as antigens and therefore be useful
as vaccines. The antigens used for vaccines include antigens of viruses whose hosts are higher vertebrates,
such as antigens of influenza A virus, human immunodeficiency virus-1, human papilloma virus,
cytomegalovirus, Epstein-Barr virus, rotavirus, and respiratory syncytial virus. Other viruses include herpes viruses such
as herpes simplex virus, varicella-zoster virus, adult T cell leukemia virus, hepatitis B virus, hepatitis A virus,
parvoviruses, papovaviruses, adenoviruses, pox viruses, reoviruses, paramyxoviruses, rhabdoviruses,
arenaviruses, and coronaviruses. Other disease states can have antigens designed for them and used in
alternative embodiments of the present invention, including antigens of pathogenic protozoa, such as malaria
antigen.

The fusion proteins of the present invention are preferably manufactured through expression of
recombinant DNA sequences.

The DNAs used in the practice of the invention may be natural or synthetic. The recombinant DNA
segments containing the nucleotide sequences coding for the embodiments of the present invention can be
prepared by the following general processes:
(a) A desired truncated gene is cut out from a plasmid in which it has been cloned, or the gene can be
chemically synthesized;
(b) An appropriate linker is added thereto as needed, followed by construction of a fused gene; and
(c) The resulting fused protein gene is ligated downstream from a suitable promoter in an expression vector.

Techniques for cleaving and ligating DNA as used in the invention are generally well known to those of
ordinary skill in the art and are described in Molecular Cloning, A Laboratory Manual, (1989) Sambrook, J., et
al., Cold Spring Harbor Laboratory Press.

As the promoter used in the present invention, any promoter is usable as long as the promoter is suitable
for expression in the host used for the gene expression. The promoters can be prepared enzymatically from
the corresponding genes, or can be chemically synthesized.

Conditions for usage of all restriction enzymes were in accordance with those of the manufacturer,
including instructions as to buffers and temperatures. The enzymes were obtained from New England Biolabs,
Bethesda Research Laboratories (BRL), Boehringer Mannheim and Promega.

Ligations of vector and insert DNAs were performed with T4 DNA ligase in 66 mM Tris-HCl, 5 mM MgCl2,
1 mM DTE, 1 mM ATP, pH 7.5 at 15°C for up to 24 hours. In general, 1 to 200 ng of vector and a 3-5x excess of
insert DNA were preferred.
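
The preferred vector and insert quantities can be converted into masses with a short calculation. The sketch below assumes that the "3-5x excess" is a molar excess (the text does not say whether it is molar or by mass) and uses hypothetical fragment sizes; the function name `insert_mass_ng` is illustrative only.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_excess: float) -> float:
    """Mass of insert (ng) giving `molar_excess` moles of insert per mole of vector."""
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    # For double-stranded DNA, moles are proportional to mass / length,
    # so the average mass per base pair cancels out of the ratio.
    return vector_ng * (insert_bp / vector_bp) * molar_excess


if __name__ == "__main__":
    # Hypothetical example: 100 ng of a 4000 bp vector and a 1300 bp insert.
    for excess in (3, 5):
        print(f"{excess}x molar excess: {insert_mass_ng(100, 4000, 1300, excess):.1f} ng")
```

Under these assumptions, a 3x to 5x molar excess corresponds to roughly 97.5 to 162.5 ng of insert.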

Selection of E. coli containing recombinant plasmids involves streaking the bacteria onto appropriate
antibiotic-containing LB agar plates or culturing in shaker flasks in LB liquid (Tryptone 10 g/L, yeast extract 5 g/L,
NaCl 10 g/L, pH 7.4) containing the appropriate antibiotic for selection when required. Choice of antibiotic for
selection is determined by the resistance markers present on a given plasmid or vector. Preferably, vectors
are selected by ampicillin.

Culturing of E. coli involves growing in Erlenmeyer flasks in LB supplemented with the appropriate
antibiotic for selection in an incubation shaker at 250-300 rpm and 37°C. Other temperatures from 25°-37°C could
be utilized. When cells are grown for protein production, they are induced at As^l with IPTG to a final con- 
centration of 0.4 mM. Other cell densities in log phase growth can alternatively be chosen for induction. 
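
The volume of IPTG stock needed to reach the stated 0.4 mM final concentration follows from a simple mass balance. The sketch below is illustrative only: the 100 mM stock concentration and the 500 ml culture volume are assumptions, since the text specifies neither, and the function name is hypothetical.

```python
def iptg_stock_volume_ml(culture_ml: float, final_mM: float = 0.4,
                         stock_mM: float = 100.0) -> float:
    """Volume of IPTG stock (ml) to add to a culture to reach `final_mM`."""
    if culture_ml <= 0:
        raise ValueError("culture volume must be positive")
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    # Mass balance: stock_mM * v = final_mM * (culture_ml + v), solved for v.
    return final_mM * culture_ml / (stock_mM - final_mM)


if __name__ == "__main__":
    # Assumed 500 ml culture and 100 mM stock; neither value is from the text.
    v = iptg_stock_volume_ml(500.0)
    print(f"add {v:.2f} ml of stock")
```

The formula accounts for the small increase in total volume caused by adding the stock itself, which is why the result is slightly above the 2.00 ml given by the simpler dilution estimate.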
Harvesting involves recovery of E. coli cells by centrifugation. For protein production, cells are harvested
3 hours after induction, though other times of harvesting could be chosen.

In the present invention, any vector, such as a plasmid, may be used as long as it can be replicated in a 
procaryotic or eucaryotic cell as a host. 
By using the vector containing the recombinant DNA thus constructed, the host cell is transformed via
the introduction of the vector DNA. 

The host cell of choice is BL21(DE3) cells (E. coli), obtained from F. Wm. Studier, Brookhaven National
Laboratories, Stony Brook, N.Y. Reference is also made to Wood, J. Mol. Biol., 16:118-133 (1966), U.S. Patent
No. 4,952,496, and Studier, et al., J. Mol. Biol. 189:113-130 (1986). However, any strain of E. coli containing
an IPTG-inducible T7 polymerase gene would be suitable. For routine cloning, E. coli strain DH5α (BRL) can
be used.

BL21(DE3) strain of E. coli was acquired under license from W. F. Studier. Reference is made to Studier,
W. F. et al., Methods in Enzymology, Vol. 185, Ch. 6, pp 60-89 (1990). This strain is unique to the extent that
it contains an inducible T7 polymerase gene. The strain has no amino acid, sugar or vitamin markers, so it can 
grow on any rich or defined bacterial medium. It can be grown between 25°C and 37°C. It needs aeration, and 
it needs IPTG for induction of the T7 polymerase. 

In the present invention, the fused proteins can be separated and purified by appropriate combinations of 
well-known separating and purifying methods. These methods include methods utilizing a solubility differential 
such as salt precipitation and solvent precipitation, methods mainly utilizing a difference in molecular weight 
such as dialysis, ultrafiltration, gel filtration and SDS-polyacrylamide gel electrophoresis, methods utilizing a
difference in electric charge such as ion-exchange column chromatography, methods utilizing specific affinity
such as affinity chromatography, methods utilizing a difference in hydrophobicity such as reverse-phase high
pressure liquid chromatography, methods utilizing a difference in isoelectric point, such as isoelectric focusing
electrophoresis, and methods using denaturation and reduction and renaturation and oxidation.

Preferred embodiments of the invention will now be described in detail in the following non-limiting exam- 
ples. The most preferred embodiments of the invention are any or all of those specifically set forth in these 
examples. These examples are not, however, to be construed as forming the only genus that is considered 
as the invention, and any combination or sub-combination of the examples may themselves form a genus. 
These examples further illustrate details for the preparation of various embodiments of the present invention. 
Those skilled in the art will readily understand that known variations of the conditions and processes of the 
following preparative procedures can be used to prepare these embodiments. 



EXAMPLE 1 



BS-PEMI-2 

A 1.3 kb NruI/SacII fragment of plasmid pVC45-DF+T (Fig. 2) (obtained from Dr. Ira Pastan of the National
Institutes of Health) containing the domain I and II coding regions of Pseudomonas exotoxin (PE) (Sequence
ID No. 1) was subcloned into pBluescript II SK (Stratagene, Fig. 3) restricted with HincII and SacII. The resulting
construct is designated BS-PE. The influenza M1 gene (Sequence ID No. 2 and 3), which codes for the
matrix protein of influenza A virus, was subcloned into BS-PE restricted with SacII and SacI by amplifying the
M1 gene from pApr701 (P. Palese, Mt. Sinai Medical Center, New York, N.Y.; pApr701 consists of the M1 gene
cloned into the EcoRI site of pBR322, shown at Fig. 4; reference is made to Young, J.F. et al., Expression of
Influenza Virus Genes; The Origin of Pandemic Influenza Virus; 1983) by polymerase chain reaction (PCR)
(Gene Amp® PCR Reagent Kit; Perkin Elmer Cetus, Norwalk, Conn. 06859) with oligonucleotide primers which
added a SacII site adjacent to M1 codon number 2 (Sequence ID No. 4) and a SacI site 3' of the M1 termination
codon (Sequence ID No. 5). This plasmid is designated BS-PEMI-1.

The truncated ompA leader coding sequence was removed from the 5' end of the fusion gene by replacing
the small XhoI/HindIII fragment of BS-PEMI-1 with the oligonucleotide sequence shown in Sequence ID No.
6. The resulting plasmid is named BS-PEMI-2 and encodes a fusion gene consisting of Pseudomonas exotoxin
amino acids 2 through 414 joined to M1 amino acids 2 to 252 (Sequence ID No. 7 and 8).
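
As a quick check on the residue ranges quoted for this fusion, the number of residues contributed by each partner can be computed directly. The sketch below counts only the stated ranges; it does not account for an initiator methionine or for any residues encoded at the junction, which would have to be read from Sequence ID No. 7 and 8. The helper names are hypothetical.

```python
def residues_in_span(first: int, last: int) -> int:
    """Number of residues in an inclusive range first..last."""
    if last < first:
        raise ValueError("last residue must not precede first residue")
    return last - first + 1


def fusion_length(*spans: tuple) -> int:
    """Total residues contributed by a series of inclusive (first, last) spans."""
    return sum(residues_in_span(first, last) for first, last in spans)


PE_SPAN = (2, 414)   # Pseudomonas exotoxin residues stated in the text
M1_SPAN = (2, 252)   # influenza M1 residues stated in the text

if __name__ == "__main__":
    print(residues_in_span(*PE_SPAN))       # 413
    print(residues_in_span(*M1_SPAN))       # 251
    print(fusion_length(PE_SPAN, M1_SPAN))  # 664
```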



EXAMPLE 2 



pVC-ompA-PEMI-2 

pVC45DF+T vector was prepared by restriction digestion with HindIII and EcoRI, followed by gel
purification.

The PEMI insert fragment was prepared by restriction digestion of BS-PEMI-1 with SacI, followed by T4
DNA polymerase treatment to remove the 3' overhang. EcoRI linkers were added to the blunted SacI site,
followed by restriction digestion with HindIII. The HindIII-EcoRI fragment was gel purified (Molecular Cloning
Manual, Gene Clean Kit, Bio 101, Inc. P.O. Box 2284, La Jolla, CA 92038) and ligated into the prepared
pVC45-DF+T vector. The resulting construct was named pVC-ompA-PEMI-2.

The ompA signal sequence was removed from the construct by restriction digestion of pVC-ompA-PEMI-2
with XbaI and HindIII. An oligonucleotide fragment containing the T7 promoter, ribosome binding site and
initiation sequence, whose base sequence is shown at Sequence ID No. 9, was ligated into the vector. The
resulting plasmid construct was named pVC-PEMI-2 and encodes a T7 polymerase-driven gene fusion consisting
of PE amino acids 2 through 414 joined to influenza M1 amino acids 2 through 252. The 5' and 3' ends of the
coding region, as well as the PE to M1 fusion site and cytotoxic T lymphocyte epitope coding sequences
(Rötzschke, O. et al., Nature 348, 252 (1990)), were confirmed by DNA sequencing.

EXAMPLE 3

BS-PEMa

The influenza Ma sequence (coding for residues 57-68 of the influenza matrix protein) was obtained by
amplifying a portion of the influenza M1 gene in pApr701 by polymerase chain reaction (PCR) with
oligonucleotide primers which added a SacII site adjacent to influenza M1 codon No. 57 (Sequence ID No. 10) and a
termination codon and a SacI site 3' of the M1 codon No. 68 (Sequence ID No. 11). This fragment was cut with
SacII and SacI and subcloned into BS-PE digested with SacII and SacI. The resulting plasmid is named
BS-PEMa-1 and was verified by sequencing through the junctions and the Ma sequence itself.
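
The primer strategy described here, in which restriction sites and a termination codon are appended to gene-specific sequences, can be sketched in code. The example below is illustrative only: the annealing sequences, the two-base 5' clamp and the choice of TAA as stop codon are hypothetical, and the primers actually used are those given in Sequence ID No. 10 and 11. The SacII (CCGCGG) and SacI (GAGCTC) recognition sequences are the standard ones.

```python
SACII_SITE = "CCGCGG"  # SacII recognition sequence
SACI_SITE = "GAGCTC"   # SacI recognition sequence
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def forward_primer(anneal: str, site: str = SACII_SITE, clamp: str = "AT") -> str:
    """5' clamp + restriction site + coding-strand sequence that anneals to the template."""
    return clamp + site + anneal.upper()


def reverse_primer(anneal_end: str, site: str = SACI_SITE,
                   stop: str = "TAA", clamp: str = "AT") -> str:
    """Primer adding a stop codon and a restriction site 3' of the coding region."""
    # Desired coding-strand layout: last coding bases, stop codon, then the site.
    top_strand = anneal_end.upper() + stop + site
    # The reverse primer is the reverse complement of that layout, clamp first.
    return clamp + reverse_complement(top_strand)


if __name__ == "__main__":
    # Hypothetical annealing regions, not taken from the M1 gene.
    print(forward_primer("ATGGCC"))
    print(reverse_primer("AAACCC"))
```

In practice the annealing regions would be considerably longer than the six bases shown, and the clamp length would be chosen to suit the enzyme.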

EXAMPLE 4

Subcloning of PEMa from BS-PEMa-1 into pVC45DF+T

The PEMa insert (Sequence ID No. 12) was prepared by restricting BS-PEMa-1 with SacI and removing
the 3' overhang by treatment with T4 DNA polymerase, then restricting with ApaI and gel purifying.

pVC45DF+T was restricted with EcoRI and the 5' overhang filled in with Klenow enzyme treatment
(Molecular Cloning Manual, ibid.). It was subsequently restricted with ApaI and gel purified. The vector and
fragment were ligated together, and the resulting construction was named pVC-ompA-PEMa-1. The construction
was verified by sequencing across the junctions and through Ma.

The ompA leader sequence was removed from pVC-ompA-PEMa-1 by digestion with XbaI and HindIII. An
oligonucleotide fragment containing the T7 promoter, ribosome binding site, initiation sequence and a
build-back of the 5' end of the PE coding region (Sequence ID No. 13) was ligated to the vector. The resulting
construction was named pVC-PEMa-1 and encodes a T7 polymerase driven gene fusion consisting of PE amino
acids 2 to 414 joined to influenza M1 amino acids 57 to 68 (Ma) (Sequence ID No. 14 and 15). The 5' end of
pVC-PEMa-1 was verified by sequencing through the oligonucleotide fragment.

EXAMPLE 5 

Construction of pVC-PEBT

A control plasmid was constructed which encodes a T7 polymerase driven gene fusion consisting of PE
amino acids 2 to 414 followed by termination codons. pVC-PEMI-2 was digested with SacII and EcoRI to
remove the M1 sequence. The vector was gel purified and ligated to an oligonucleotide that builds back PE codon
No. 414 followed by termination signals shown in Sequence ID No. 16. The resulting construction was named
pVC-PEBT (Sequence ID No. 17 and 18) and was verified by sequencing across the junctions and the
oligonucleotide addition.

EXAMPLE 6 

BSK-PEMI

BSK-PEMI was made from BS-PEMI by replacing the 21 base pair XhoI/HindIII fragment with a
24 base pair fragment encoding a consensus eucaryotic ribosome binding site (Sequence ID No. 19). The
purpose of the construct was to increase the yields of in vitro translated PEMI protein. Thus, an additional object
of the invention is to increase yields of translated PEMI protein.

EXAMPLE 7

pVCPE/2 (pVC45DF+T/2) 

pVCPE/2 was made by replacing the 105 base pair PpuMI/EcoRI fragment of pVC45DF+T with a 46 base
pair DNA fragment encoding an in-frame duplication of PE codons 604 to 613 flanked by unique cloning sites
(Sequence ID No. 20). This construct is used for generating full-length molecules of PE with the deletion of
residue 553, resulting in an inactivated toxin domain (Sequence ID No. 21 and 22), fused to protein segments
of choice between PE codons 604 and 605. One may replace the ompA signal sequence with the
promoter/ribosome binding site as described for pVC-PEMI-2.

EXAMPLE 8

pVCPE/2-Ma

pVCPE/2-Ma was made by ligating into the XmaI site of pVCPE/2 a 48 base pair DNA fragment encoding
M1 amino acids 55 through 67 (Sequence ID No. 23). This construct expresses in E. coli full-length PE with M1
amino acids 55 through 67 inserted between PE amino acids 604 and 605 (Sequence ID No. 24 and 25). One
may replace the ompA signal sequence with the promoter/ribosome binding site as described for pVC-PEMI-2.

EXAMPLE 9 

pVCPE/2-MI:15-106

pVCPE/2-MI:15-106 was made by subcloning a PCR-amplified DNA fragment encoding M1 amino acids
15 through 106 into the XmaI site of pVCPE/2. The sequences of the oligonucleotide primers used to amplify
the M1 segment are shown at Sequence ID No. 26 and 27, respectively. This construct expresses in E.
coli full-length PE with M1 amino acids 15 through 106 inserted between PE amino acids 604 and 605 (Sequence
ID No. 28 and 29). One may replace the ompA signal sequence with the promoter/ribosome binding site as
described for pVC-PEMI-2.

EXAMPLE 10 

pVCPEdel(403-613)

pVCPEdel(403-613) was made by restricting pVC45DF+T with SacII followed by elimination of the 3' SacII
overhang with T4 DNA polymerase and the ligation of a 3-frame termination linker whose nucleic acid
sequence is given at Sequence ID No. 30. This construct will express PE domains I, II and Ib only, fused to the
ompA leader in E. coli.

EXAMPLE 11 

pVCPEdel(403-505)

pVCPEdel(403-505) was made by restricting pVC45DF+T with SacII and XhoI followed by removal of
restriction overhangs with mung bean nuclease (New England Biolabs). The vector fragment was recovered and
reclosed with DNA ligase. This construct will express in E. coli the PE protein lacking amino acids 403 through
505.

EXAMPLE 12

pVCPEdel(494-505)

pVCPEdel(494-505) was made by restricting pVC45DF+T with BamHI and XhoI followed by the filling in
of the 5' overhangs with Klenow fragment. The vector fragment was recovered and reclosed with DNA ligase.
This construct will express in E. coli the PE protein lacking amino acids 494 through 505.

EXAMPLE 13

pVCPEdel(494-610)

pVCPEdel(494-610) was made by restricting pVC45DF+T with BamHI and PpuMI followed by the filling
in of the 5' overhangs with Klenow fragment. The vector fragment was recovered and reclosed with DNA ligase.
This construct will express in E. coli the PE protein lacking amino acids 494 through 610. All of the pVCPEdel
plasmids were useful in determining to what extent the toxin domain of PE could be truncated without resulting
in the expression of an insoluble protein in E. coli. It thus became an additional object of the invention to provide
hybrids having the minimal toxin domain of PE that would retain water solubility.
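The extent of each truncation in Examples 10 through 13 follows from the residue ranges in the plasmid names. The sketch below is illustrative only. It assumes a mature PE length of 613 amino acids, which is inferred from the codon numbering used in these examples rather than stated here, and the helper names are hypothetical.

```python
# Assumption: mature PE is 613 residues long, inferred from the codon
# numbering in the examples (for instance "PE codons 604 to 613").
PE_LENGTH = 613

# Residue ranges taken from the plasmid names in Examples 10 to 13.
DELETIONS = {
    "pVCPEdel(403-613)": (403, 613),
    "pVCPEdel(403-505)": (403, 505),
    "pVCPEdel(494-505)": (494, 505),
    "pVCPEdel(494-610)": (494, 610),
}


def residues_removed(first, last):
    """Count residues in an inclusive range first..last."""
    return last - first + 1


def residues_remaining(first, last, total=PE_LENGTH):
    """Count PE residues left after deleting first..last."""
    return total - residues_removed(first, last)


for name, (first, last) in DELETIONS.items():
    print(name, residues_removed(first, last), residues_remaining(first, last))
```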

EXAMPLE 14 

Addition of Sequences Between PE and M1 in pVC-PEMI-2

Oligonucleotide linkers can be added at the SacII site between PE and M1 in pVC-PEMI-2. These linkers
can be designed to add cleavage sites and/or signal sequences which can help the M1 portion of the fusion
protein to become available for presentation within the cell. SacII digestion cleaves the gene between the last
two PE codons (for amino acids 413 and 414) and provides an appropriate site for such additions.

The following four constructions have been made by inserting linkers at the SacII site. The constructions
have been verified by sequencing across the SacII junctions and through the complete linker.

EXAMPLE 15 

pVC-PE-RK-MI

This vector contains an Arg Lys (RK) cleavage site inserted into the SacII site, using an oligonucleotide
linker as shown in Sequence ID No. 31. The resulting amino acid sequence between amino acids 413 and 414
of PE is Gly Gly Arg Lys Ser.

EXAMPLE 16

pVC-PE-RKSig1-MI

This vector contains an Arg Lys (RK) cleavage site and the signal sequence from the Influenza A
hemagglutinin (HA) protein, shown in Sequence ID No. 32, inserted at the SacII site using the oligonucleotide
linker disclosed at Sequence ID No. 33. The resulting amino acid sequence between amino acids 413 and 414
of PE is shown in Sequence ID No. 34.

EXAMPLE 17

pVC-PE-Sig1-MI

This vector contains the signal sequence of HA without the RK cleavage site, inserted into the SacII site
using the oligonucleotide linker shown at Sequence ID No. 35. The resulting amino acid sequence between
amino acids 413 and 414 of PE is shown at Sequence ID No. 36.

EXAMPLE 18 

pVC-PE-Sig2-MI

This vector contains the signal sequence shown at Sequence ID No. 37, derived from amino acids 22 to
48 of ovalbumin, inserted into the SacII site using the oligonucleotide linker of Sequence ID No. 38. The
resulting amino acid sequence between amino acids 413 and 414 of PE is shown in Sequence ID No. 39.

Addition of Sequences Between PE and Ma in pVC-PEMa-1

Oligonucleotide linkers can be added at the SacII site between PE and Ma in pVC-PEMa-1. These linkers
can be designed to add cleavage sites and/or signal sequences which can help the Ma peptide to become
available for presentation within the cell. SacII digestion cleaves the gene between the last two PE codons (for
amino acids 413 and 414) and thus provides an appropriate site for such additions.

The following four examples have been made by inserting linkers at the SacII site. The constructions have
been verified by sequencing across the SacII junctions and through the complete linker.

EXAMPLE 19

pVC-PE-RKSig1-Ma

This vector contains an Arg Lys (RK) cleavage site and the signal sequence from the Influenza A
hemagglutinin (HA) protein inserted into a blunted SacII site, using the oligonucleotide linker shown at Sequence
ID No. 40. The resulting amino acid sequence between amino acids 413 and 414 of PE is
shown at Sequence ID No. 41.

EXAMPLE 20

pVC-PE-Sig1-Ma

This vector contains the signal sequence of HA without a cleavage site, inserted into a blunted SacII site
using the oligonucleotide linker shown in Sequence ID No. 42. The resulting amino acid sequence between
amino acids 413 and 414 of PE is shown in Sequence ID No. 43.

EXAMPLE 21 

pVC-PE-Sig2-Ma

This vector contains a signal sequence derived from amino acids 22 through 48 of ovalbumin inserted
into a blunted SacII site, using the oligonucleotide linker shown in Sequence ID No. 44. The resulting amino
acid sequence between amino acids 413 and 414 of PE is shown in Sequence ID No. 45.

EXAMPLE 22

pVC-PE-Sig1Sig2-Ma

This vector contains the signal sequence derived from HA, followed by the signal sequence from ovalbumin,
inserted into the SacII site using the oligonucleotide linker shown at Sequence ID No. 46. The resulting amino
acid sequence between amino acids 413 and 414 of PE is shown at Sequence ID No. 47.

EXAMPLE 23

BSPEMIc5aa

The plasmid BSPEMI-2 was digested with SacI and StuI and ligated to the oligonucleotide linker shown
at Sequence ID No. 48. This linker builds back the C-terminus of the M1 protein and adds the last five amino acids
from the C-terminus of the PE protein, whose sequence is Arg Glu Asp Leu Lys, followed by a termination
codon. It also incorporates an EcoRI site. The resulting plasmid was named BSPEMIc5aa and was sequenced
across the junctions (Sequence ID No. 49 and 50) and the linker for verification of the construction.

EXAMPLE 24

pVC-PEMIc5aa

The plasmid BSPEMIc5aa was digested with HindIII and EcoRI and the 1.8 kb PEMIc5aa fragment was gel
purified. The plasmid pVC-PEMI-2 was digested with HindIII and EcoRI, the 3.2 kb vector fragment was
ligated to the 1.8 kb PEMIc5aa fragment, and the resulting plasmid was named pVC-PEMIc5aa. The 5' and 3'
ends of the PEMIc5aa insert were verified by sequencing.

EXAMPLE 25

pVC-PENPc5aa

A fragment containing the nucleoprotein (NP) of Influenza A virus was obtained from plasmid pApr501
(obtained from Peter Palese, Mt. Sinai Medical Center, New York, N.Y.; pApr501 is said nucleoprotein gene cloned
into the EcoRI site of pBR322; Fig. 4) by polymerase chain reaction with oligonucleotide primers which added
a SacII site adjacent to the ATG codon of NP to give the sequence shown at Sequence ID No. 51, and the last
5 amino acids of PE followed by a termination codon and an EcoRI site to the 3' end of NP to give the sequence
shown at Sequence ID No. 52. The polymerase chain reaction fragment was digested with SacII and EcoRI
and ligated to the plasmid pVC-PEMI-2 digested with SacII and EcoRI. The resulting plasmid was named
pVC-PENPc5aa. The 5' and 3' ends of the PENPc5aa insert (Sequence ID No. 53 and 54) were verified by
sequencing. This construction fuses the binding and translocation domains of PE to the Influenza A nucleoprotein.

EXAMPLE 26

pVC-ompA-PEGAG

The HIV GAG gene was obtained from plasmid HIVpBR322 (obtained from Ron Diehl, Merck, Sharp and
Dohme Research Laboratories, West Point, PA; Fig. 5) by polymerase chain reaction with oligonucleotides
that added a SacII site adjacent to the ATG codon of GAG to give the nucleotide sequence shown at Sequence
ID No. 55, and a SacI site immediately after the termination codon at the 3' end to give the nucleotide sequence
at Sequence ID No. 56. The polymerase chain reaction fragment was digested with SacII and ligated to plasmid
pVC45DF+T, which had been digested with EcoRI, the 5' overhang filled in by Klenow fragment, and digested
with SacII. The resulting plasmid was named pVC-ompA-PEGAG (Sequence ID No. 57 and 58) and was
verified by a partial sequence at the SacII junction.

This construction fused the binding and translocation domains of PE to the GAG gene of HIV-1 virus. The fusion
protein contains an ompA leader sequence. Alternatively, any vector containing the complete coding region
for HIV GAG can be used with these oligomers to generate the HIV GAG gene by PCR.

EXAMPLE 27

Expression of PEMI, PEMa and PEBT

Frozen competent BL21(DE3) cells (as described by Studier, et al., J. Mol. Biol., 189, 113-130, 1986) were
prepared as described (DNA Cloning, Vol. 1, p. 121, Ed. D. M. Glover, IRL Press, Wash., D.C.).

BL21(DE3) cells were transformed with pVC-PEMI-2, pVC-PEMa-1, or pVC-PEBT as described below
(this can be performed with pVC-PE fusion plasmids in general) and transformants were selected on L-Amp
plates. Fresh transformants were used to inoculate L-Amp liquid cultures at A560=0.1. Cultures were grown
at 37°C with vigorous aeration and induced at A560=1.0 with IPTG to a final concentration of 0.4 mM. Cultures
were harvested after 3 hours of induction and the cell pellets used for protein extraction and purification
(Protein Structure: A Practical Approach, T.E. Creighton, ed., IRL Press at Oxford Univ. Press, Ch. 9, 191 (1989)).

Transformation Procedure 

A bath of dry ice/ethanol was prepared and maintained at -70°C. Competent cells were removed from a
-70°C freezer and thawed on ice. A sufficient number of 17 x 100 mm polypropylene tubes (Falcon 2059) were
placed on ice. 100 µl aliquots of gently mixed cells were prepared in the chilled polypropylene tubes. DNA was
added by moving a pipette through the cells while dispensing; the cells were then gently shaken for 5 seconds
after addition. The cells were incubated on ice for 30 minutes, then heat-shocked in a 42°C water bath for 45
seconds without shaking. The cells were again placed on ice for 2 minutes. 0.9 ml of S.O.C. reagent
(Bacto-tryptone 2%, Yeast Extract 0.5%, NaCl 10 mM, KCl 2.5 mM, MgCl2/MgSO4 20 mM, Glucose 20 mM and distilled
water, up to 100 ml) was added and the mixture shaken for 1 hour at 225 rpm and 37°C, then plated on antibiotic
plates and spread gently.



EXAMPLE 28 



Incubation of U-2 OS Cells With 51Cr and Protein/PEMa

U-2 OS cells (ATCC) were harvested from flasks, after a 1X wash with RCM 8, using 1 mM EDTA. The
flasks were incubated at 37°C for 10 minutes, until cells were nonadherent. Five ml of U-2 OS medium
[McCoy's 5A (GIBCO) supplemented with 15% fetal bovine serum (HyClone) and penicillin 100 U/ml and
streptomycin 100 µg/ml (GIBCO)] was added, and the cells were centrifuged for 10 minutes at 210 x g.

Cells were resuspended in U-2 OS medium at 8.5 x 10^5/ml. To each well of a 12-well plate, 0.7 ml of cell
suspension was added. Negative controls include U-2 OS medium alone and PEBT. The positive control for
sensitization of U-2 OS cells is KKAMI (2 ng/ml), from M. Gammon and H. Zweerink (Merck, Sharp and Dohme
Research Laboratories, Rahway, NJ). PEMa was added at 0.2 µM or greater well concentration.
Simultaneously, 137.5 µCi of 51Cr (Amersham) was added to each well. Medium was added to all wells to bring the total
volume to 1 ml. This was placed at 37°C, 5.5% CO2 for 14 hours.
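The number of cells seeded per well follows from the stated density and volume. The following is a minimal sketch of that arithmetic; the function names are hypothetical, and the figures are those given in this example.

```python
def cells_in_volume(density_per_ml, volume_ml):
    """Number of cells contained in a volume at a given density."""
    return density_per_ml * volume_ml


def density_after_dilution(cells, final_volume_ml):
    """Cell density once the well is brought to its final volume."""
    return cells / final_volume_ml


# Figures from this example: 8.5 x 10^5 cells/ml, 0.7 ml per well,
# brought to a total volume of 1 ml.
seeded = cells_in_volume(8.5e5, 0.7)
print(round(seeded), round(density_after_dilution(seeded, 1.0)))
```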

EXAMPLE 29 



Assay Protocol for CTL Activity Against Sensitized U-2 OS Targets

After the 14 hour incubation, U-2 OS cells were removed, after a 1X RCM 8 wash, using 1 mM EDTA. Plates
were incubated at 37°C for 10 minutes until cells were nonadherent. K medium [RPMI 1640 (GIBCO)
supplemented with 10% fetal bovine serum (HyClone), 10 mM HEPES (GIBCO), 2 mM L-glutamine (GIBCO), penicillin
100 U/ml and streptomycin 100 µg/ml (GIBCO), and 50 µM 2-mercaptoethanol (Bio-Rad)] was added to give
a total volume of 10 ml; cells were centrifuged for 10 minutes at 210 x g. The cells were incubated at room
temperature for 10 minutes in 10 ml of K medium before entering the second centrifugation. The cells were
then resuspended in 1 ml of K medium, counted, and resuspended to 1 x 10^5/ml in K medium.

Human cytotoxic T lymphocytes, generated from one donor, were harvested, centrifuged for 10 minutes
at 92 x g, and resuspended in K medium at 2.5 x 10^6/ml.

100 µl of human CTLs were added to each well of a 96-well U-bottom microtiter plate (CoStar). 100 µl of
the U-2 OS 51Cr-labeled targets were also added to these wells for a final effector/target ratio of 25:1.
Spontaneous 51Cr release was determined by incubating U-2 OS cells with 100 µl of K medium alone. The maximal
release was determined by adding 100 µl of 6 M HCl to 100 µl of targets. The plates were quickly centrifuged
to bring down the cells, and incubated for 2 hours at 37°C.
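The effector/target ratio quoted above can be checked from the stated densities and volumes. The following is an illustrative sketch with hypothetical function names.

```python
def cells_added(density_per_ml, volume_ul):
    """Cells delivered by a volume given in microliters."""
    return density_per_ml * volume_ul / 1000.0


def effector_target_ratio(effectors, targets):
    """Ratio of effector cells to target cells in one well."""
    if targets <= 0:
        raise ValueError("target cell count must be positive")
    return effectors / targets


# Figures from this example: 100 microliters of CTLs at 2.5 x 10^6/ml
# and 100 microliters of targets at 1 x 10^5/ml.
ctls = cells_added(2.5e6, 100)
targets = cells_added(1e5, 100)
print(round(effector_target_ratio(ctls, targets)))
```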

After this 2 hour incubation, the plates were centrifuged for 5 minutes, 330 x g, 5°C; 30 µl of supernatant
was harvested from each well onto a plastic-backed filter mat (Pharmacia/LKB). The mat was dried in the
microwave for 3 minutes, on medium-high power. The mat was placed into a sample bag with 10 ml of BetaPlate
Scint, heat sealed and placed into the BetaPlate 1205 counter (Pharmacia/LKB). Results were expressed as
% specific lysis, defined as:

% specific lysis = [(Experimental - Spontaneous) / (Maximal - Spontaneous)] x 100

where

Experimental = counts per minute from the 30 µl of supernatant harvested from the wells containing targets
plus human cytotoxic T lymphocytes, as determined by the BetaPlate 1205 counter;

Spontaneous = counts per minute from the 30 µl of supernatant harvested from the wells containing targets
plus medium alone, as determined by the BetaPlate 1205 counter; and

Maximal = counts per minute from the 30 µl of supernatant harvested from the wells containing targets plus 6 M
HCl (Fisher Scientific), as determined by the BetaPlate 1205 counter.
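The definition above can be written as a short function. This is a minimal sketch: the function names are hypothetical, the counts in the usage line are invented purely for illustration and are not experimental data, and the 10% threshold is the cutoff this example applies for a positive response.

```python
def percent_specific_lysis(experimental, spontaneous, maximal):
    """Compute % specific lysis from three counts-per-minute readings."""
    if maximal <= spontaneous:
        raise ValueError("maximal release must exceed spontaneous release")
    return (experimental - spontaneous) / (maximal - spontaneous) * 100.0


def is_positive(lysis_percent, threshold=10.0):
    """Apply the greater-than-10% criterion for a positive response."""
    return lysis_percent > threshold


# Hypothetical counts chosen only to exercise the formula.
lysis = percent_specific_lysis(experimental=1500, spontaneous=500, maximal=2500)
print(lysis, is_positive(lysis))
```

Rejecting readings in which maximal release does not exceed spontaneous release avoids a division by zero or a meaningless negative denominator.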
Results are presented graphically in Fig. 5, with U-2 OS medium alone and PEBT as negative controls,
and KKAMI as a positive control. Greater than 10% specific lysis is considered a positive response (Cerottini,
et al., J. Exp. Med. 140:703, 1974).

EXAMPLE 30

Generation of M1-specific Human Cytotoxic T Lymphocytes

Original stock of human cytotoxic T lymphocytes was derived by harvesting blood from one donor into a
syringe (Becton Dickinson) containing 25 U of heparin for each ml of whole blood (Elkins-Sinn, Inc.). The
heparinized blood was pipetted directly into a Leucoprep tube (Becton Dickinson) and centrifuged for 20 minutes
at 1700 x g. The buffy coat which was seen just above the interface was removed, centrifuged for 10 minutes
at 92 x g, and washed twice in RPMI 1640 (GIBCO). The peripheral blood mononuclear cells (PBLs) recovered
from the Leucoprep procedure were resuspended in 10 ml of CTL medium [RPMI 1640 (GIBCO) supplemented
with 10% donor or pooled human plasma, 4 mM L-glutamine, 10 mM HEPES, penicillin 100 U/ml and
streptomycin 100 µg/ml (GIBCO)] at 1 x 10^6/ml.

M1 peptide (received from M. Gammon and H. Zweerink, MSDRL, Rahway; 2 mg/ml stock) in DMSO was
diluted 1:10 in RPMI 1640 (GIBCO). M1 peptide was added to the 10 ml of lymphocytes at a final concentration
of 5 ng/ml. The cells were then plated at 1.5 x 10^6/well in 24-well plates (Nunc).

Two U/ml of Interleukin-2 ala-125 (Amgen) was added on Day 3. The cell density was adjusted to 1 x 10^6/ml
as needed, and the medium was supplemented with 2 U/ml additional Interleukin-2 to compensate for the
increase in volume. Cells were restimulated with peptide-pulsed peripheral blood lymphocytes every 7 days as
described below. Interleukin-2 ala-125 (Amgen) was replenished every 3 days.

Cytotoxic T lymphocytes and unstimulated PBLs were frozen (CryoMed) in a mixture of 70% RPMI 1640
(GIBCO), 20% fetal bovine serum (HyClone), and 10% dimethyl sulfoxide (Sigma) and thawed as needed.

EXAMPLE 31 

Recovery and Restimulation of Frozen CTLs

Cytotoxic T lymphocytes (CTLs) were thawed in a 37° water bath and then resuspended in 35 ml of CTL
medium [RPMI 1640 (GIBCO) supplemented with 10% donor or pooled human plasma, 4 mM L-glutamine, 10
mM HEPES, penicillin 100 U/ml and streptomycin 100 µg/ml (GIBCO)]. The cytotoxic T lymphocytes were then
placed at 37°, 5% CO2 for 1 hour. The cell suspension was centrifuged for 10 minutes at 92 x g. The cells were
resuspended at 5 x 10^5/ml in CTL medium.

The source of stimulator cells for the freshly thawed cytotoxic T lymphocytes was freshly harvested PBL,
which had been collected using the Leucoprep method described above. For peptide pulsing, an appropriate
number (2 x 10^6 - 10^7) of PBL were centrifuged, the supernatant was aspirated, and KKAMI at 200 µg/ml in
RPMI 1640 (GIBCO) plus 10% DMSO (Sigma) was added at the rate of 100 µl of KKAMI for every 10^7 cells.
The cells were incubated for 1 hour at 37°, 5% CO2. The peptide-pulsed peripheral blood lymphocytes were
irradiated with 2,000 Rads using a 60Co source. The cells were washed once in RPMI 1640, centrifuged for 10
minutes at 92 x g, and resuspended in CTL medium at 1 x 10^6/ml.

Equal volumes of cytotoxic T lymphocytes and irradiated, peptide-pulsed peripheral blood lymphocytes were
mixed together for a final ratio of 1 CTL:2 peptide-pulsed PBL. Interleukin-2 ala-125 (Amgen) was added at a
final concentration of 2 U/ml. The cells were thoroughly mixed together with the Interleukin-2 ala-125 and 1.2
ml was plated into each well of a 48-well plate (CoStar).

The cells were counted and Interleukin-2 ala-125 was replenished every 3 days. This was achieved by
pooling all the wells into a centrifuge tube, counting the cells in a hemocytometer counting chamber, adjusting
the cells to 1 x 10^6/ml with CTL medium, and adding 2 U/ml of Interleukin-2 ala-125. Then 1.5 x 10^6 cytotoxic
T lymphocytes in 1.5 ml of CTL medium with Interleukin-2 ala-125 were plated into each well of a 24-well plate
(CoStar). The restimulation process was repeated every seven days, at which time frozen PBLs were used
as the source of stimulators.

EXAMPLE 32

Binding of PEMa to the PE receptor

PEMa was used in a binding/competition assay to compete with PE for the PE receptor on U-2 OS cells.
In doing so, PEMa was shown in Figure 6 to protect the cells from the toxic effects of PE. Therefore,
replacement of the toxin domain of PE with the Influenza matrix peptide (amino acids 57-68) did not prohibit the binding
of this chimeric protein to the PE receptor. This suggests that the ability of PEMa to sensitize target cells for
lysis by CTLs specific for the matrix peptide is mediated through PE receptor-mediated uptake and processing.

U-2 OS cells were grown to a density of 20,000 cells/100 µl in 96-well plates. Cells were preincubated with
PEMa (0, 0.1, 1, 10 and 50 µg in 100 µl of complete McCoy's 5A medium) for 30 minutes at 37°C, followed by
incubation with or without PE (10 ng) for 2 minutes. This represents a 0-, 10-, 100-, 1000-, and 5000-fold excess
of PEMa over PE, respectively. Cells were washed with McCoy's medium (3 x 200 µl), then incubated with
[35S]methionine (2 µCi/100 µl) for an additional 5 hours at 37°C and washed (3 x 200 µl). Cells were lysed in
10 mM EDTA (100 µl) and aliquots (5 µl) were spotted onto Whatman 3MM filters. Incorporation of radioactivity
was assayed by TCA precipitation of the cellular proteins onto the filter papers by immersion into ice-cold TCA
(10% w/v) for at least 1 hour. Filters were washed once with 5% TCA and 3 times with ethanol and dried.
Radioactivity was determined by liquid scintillation counting. Incorporation of [35S]methionine into the
TCA-precipitable pool of cellular proteins in the absence (open circles) or presence (closed circles) of PE is shown as
a function of log excess PEMa. Error bars represent +/- SEM for n=9. Using a one-tailed t-test, incorporation
of [35S]methionine was determined to be significantly lower in the presence of PE than in the absence of PE
at 0-, 10-, and 100-fold excesses of PEMa (99.5%, 99.5% and 95% confidence limits, respectively). However,
at 1000- and 5000-fold excesses of PEMa, incorporation was not significantly different in the presence or
absence of PE.
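The fold-excess values quoted in this example correspond to the ratio of the mass of PEMa to the mass of PE in each well. The following is an illustrative sketch of that conversion; the function name is hypothetical, and treating the excess as a mass ratio rather than a molar ratio is an assumption that happens to reproduce the stated figures.

```python
# Amounts from this example: PEMa in micrograms, PE in nanograms.
PEMA_UG = [0, 0.1, 1, 10, 50]
PE_NG = 10


def fold_excess(competitor_ug, toxin_ng):
    """Mass ratio of competitor to toxin, with both converted to ng."""
    if toxin_ng <= 0:
        raise ValueError("toxin mass must be positive")
    return competitor_ug * 1000.0 / toxin_ng


for amount in PEMA_UG:
    print(amount, round(fold_excess(amount, PE_NG)))
```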

Following preparation of the protein hybrids of the present invention, a suspension of the protein hybrids
suitable for injection into the host animal must be prepared. Typical suspension vehicles include sterile saline
and sterile water for injection. Various agents may be added as preservatives, including benzethonium chloride
(0.0025%), phenol (0.5%) and thiomersal (1:10,000). Strength of the vaccine will be measured as the mass of fusion
protein which generates a protective response, defined by in vitro/in vivo results, per given host species, a
method known to those of ordinary skill in the art.

The suspensions for injection must, of course, be prepared under sterile conditions, with a
total absence of living organisms and absolute freedom from biological contamination in the suspension
for injection.

Although water is always the solvent of choice for an injectable preparation, co-solvents that may be
additionally present include ethyl alcohol, glycerin, propylene glycol, polyethylene glycol and dimethylacetamide.
Buffers may be added, including acetic acid, citric acid or phosphoric acid systems. Antioxidants can include
ascorbic acid, BHA, BHT, sodium bisulfite, and sodium metabisulfite. Tonicity can be adjusted with agents such
as dextrose, sodium chloride and sodium sulfate.

Aseptic manufacture of vaccines, including their packaging, is conducted according to methods well known
to those of ordinary skill in the art, and as described in standard texts on the subject, including Lachman, L.,
et al., The Theory And Practice of Industrial Pharmacy; Dittert, L., ed., Sprowl's American Pharmacy; and
Remington's Pharmaceutical Sciences.

While the invention has been described and illustrated in reference to certain preferred embodiments
thereof, those skilled in the art will appreciate that various changes, modifications and substitutions can be
made therein without departing from the spirit and scope of the invention. It is intended, therefore, that the
invention be limited only by the scope of the claims which follow, and that such claims be interpreted as broadly
as is reasonable.

SEQUENCE LISTING

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1294 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TCGCGATTGC AGTGGCACTG GCTGGTTTCG CTACCGTAGC GCAGGCCGCG AATTTGGCCG 60
AAGAAGCTTT CGACCTCTGG AACGAATGCG CCAAAGCCTG CGTGCTCGAC CTCAAGGACG 120
GCGTGCGTTC CAGCCGCATG AGCGTCGACC CGGCCATCGC CGACACCAAC GGCCAGGGCG 180
TGCTGCACTA CTCCATGGTC CTGGAGGGCG GCAACGACGC GCTCAAGCTG GCCATCGACA 240
ACGCCCTCAG CATCACCAGC GACGGCCTGA CCATCCGCCT CGAAGGCGGC GTCGAGCCGA 300
ACAAGCCGGT GCGCTACAGC TACACGCGCC AGGCGCGCGG CAGTTGGTCG CTGAACTGGC 360
TGGTACCGAT CGGCCACGAG AAGCCCTCGA ACATCAAGGT GTTCATCCAC GAACTGAACG 420
CCGGCAACCA GCTCAGCCAC ATGTCGCCGA TCTACACCAT CGAGATGGGC GACGAGTTGC 480
TGGCGAAGCT GGCGCGCGAT GCCACCTTCT TCGTCAGGGC GCACGAGAGC AACGAGATGC 540
AGCCGACGCT CGCCATCAGC CATGCCGGGG TCAGCGTGGT CATGGCCCAG ACCCAGCCGC 600
GCCGGGAAAA GCGCTGGAGC GAATGGGCCA GCGGCAAGGT GTTGTGCCTG CTCGACCCGC 660
TGGACGGGGT CTACAACTAC CTCGCCCAGC AACGCTGCAA CCTCGACGAT ACCTGGGAAG 720
GCAAGATCTA CCGGGTGCTC GCCGGCAACC CGGCGAAGCA TGACCTGGAC ATCAAACCCA 780
CGGTCATCAG TCATCGCCTG CACTTTCCCG AGGGCGGCAG CCTGGCCGCG CTGACCGCGC 840
ACCAGGCTTG CCACCTGCCG CTGGAGACTT TCACCCGTCA TCGCCAGCCG CGCGGCTGGG 900
AACAACTGGA GCAGTGCGGC TATCCGGTGC AGCGGCTGGT CGCCCTCTAC CTGGCGGCGC 960
GGCTGTCGTG GAACCAGGTC GACCAGGTGA TCCGCAACGC CCTGGCCAGC CCCGGCAGCG 1020
GCGGCGACCT GGGCGAAGCG ATCCGCGAGC AGCCGGAGCA GGCCCGTCTG GCCCTGACCC 1080
TGGCCGCCGC CGAGAGCGAG CGCTTCGTCC GGCAGGGCAC CGGCAACGAC GAGGCCGGCG 1140
CGGCCAACGC CGACGTGGTG AGCCTGACCT GCCCGGTCGC CGCCGGTGAA TGCGCGGGCC 1200
CGGCGGACAG CGGCGACGCC CTGCTGGAGC GCAACTATCC CACTGGCGCG GAGTTCCTCG 1260
GCGACGGCGG CGACGTCAGC TTCAGCACCC GCGG 1294

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 759 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

ATGAGTCTTC TAACCGAGGT CGAAACGTAC GTTCTCTCTA TCATCCCGTC AGGCCCCCTC 60
AAAGCCGAGA TCGCACAGAG ACTTGAAGAT GTCTTTGCAG GGAAGAACAC CGATCTTGAG 120
GTTCTCATGG AATGGCTAAA GACAAGACCA ATCCTGTCAC CTCTGACTAA GGGGATTTTA 180
GGATTTGTGT TCACGCTCAC CGTGCCCAGT GAGCGAGGAC TGCAGCGTAG ACGCTTTGTC 240
CAAAATGCCC TTAATGGGAA CGGGGATCCA AATAACATGG ACAAAGCAGT TAAACTGTAT 300
AGGAAGCTCA AGAGGGAGAT AACATTCCAT GGGGCCAAAG AAATCTCACT CAGTTATTCT 360
GCTGGTGCAC TTGCCAGTTG TATGGGCCTC ATATACAACA GGATGGGGGC TGTGACCACT 420
GAAGTGGCAT TTGGCCTGGT ATGTGCAACC TGTGAACAGA TTGCTGACTC CCAGCATCGG 480
TCTCATAGGC AAATGGTGAC AACAACCAAC CCACTAATCA GACATGAGAA CAGAATGGTT 540
TTAGCCAGCA CTACAGCTAA GGCTATGGAG CAAATGGCTG GATCGAGTGA GCAAGCAGCA 600
GAGGCCATGG AGGTTGCTAG TCAGGCTAGG CAAATGGTGC AAGCGATGAG AACCATTGGG 660
ACTCATCCTA GCTCCAGTGC TGGTCTGAAA AATGATCTTC TTGAAAATTT GCAGGCCTAT 720
CAGAAACGAA TGGGGGTGCA GATGCAACGG TTCAAGTGA 759
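The relationship between the nucleotide sequence of SEQ ID NO:2 and the M1 protein it encodes can be illustrated by translating a portion of it with the standard genetic code. The sketch below is illustrative only: it builds the standard codon table in the conventional T, C, A, G ordering and translates the first 60 and the last 39 nucleotides of the sequence. The function and constant names are hypothetical.

```python
import itertools

# Standard genetic code, listed in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string in frame 1; '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore any incomplete final codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 60 and last 39 nucleotides of SEQ ID NO:2.
FIRST_60 = "ATGAGTCTTC TAACCGAGGT CGAAACGTAC GTTCTCTCTA TCATCCCGTC AGGCCCCCTC"
LAST_39 = "CAGAAACGAA TGGGGGTGCA GATGCAACGG TTCAAGTGA"

print(translate(FIRST_60))
print(translate(LAST_39))
```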

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 253 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Met Ser Leu Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Ile Pro
1               5                   10                  15

Ser Gly Pro Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu Asp Val Phe
            20                  25                  30

Ala Gly Lys Asn Thr Asp Leu Glu Val Leu Met Glu Trp Leu Lys Thr
        35                  40                  45

Arg Pro Ile Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly Phe Val Phe
    50                  55                  60

Thr Leu Thr Val Pro Ser Glu Arg Gly Leu Gln Arg Arg Arg Phe Val
65                  70                  75                  80

Gln Asn Ala Leu Asn Gly Asn Gly Asp Pro Asn Asn Met Asp Lys Ala
                85                  90                  95

Val Lys Leu Tyr Arg Lys Leu Lys Arg Glu Ile Thr Phe His Gly Ala
            100                 105                 110

Lys Glu Ile Ser Leu Ser Tyr Ser Ala Gly Ala Leu Ala Ser Cys Met
        115                 120                 125

Gly Leu Ile Tyr Asn Arg Met Gly Ala Val Thr Thr Glu Val Ala Phe
    130                 135                 140

Gly Leu Val Cys Ala Thr Cys Glu Gln Ile Ala Asp Ser Gln His Arg
145                 150                 155                 160

Ser His Arg Gln Met Val Thr Thr Thr Asn Pro Leu Ile Arg His Glu
                165                 170                 175

Asn Arg Met Val Leu Ala Ser Thr Thr Ala Lys Ala Met Glu Gln Met
            180                 185                 190

Ala Gly Ser Ser Glu Gln Ala Ala Glu Ala Met Glu Val Ala Ser Gln
        195                 200                 205

Ala Arg Gln Met Val Gln Ala Met Arg Thr Ile Gly Thr His Pro Ser
    210                 215                 220

Ser Ser Ala Gly Leu Lys Asn Asp Leu Leu Glu Asn Leu Gln Ala Tyr
225                 230                 235                 240

Gln Lys Arg Met Gly Val Gln Met Gln Arg Phe Lys Xaa
                245                 250

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

ATACCCGCGG CAGTCTTCTA ACCGAGGTCG 30

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

CCCCACGTCT ACGTTGCCAA GTTCACTCTC GAGATA 36

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

CTCGAGAATT CATGGCCGAG GAAGCTT 27
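The short oligonucleotides of SEQ ID NO:4 through NO:6 can be checked against their declared lengths and scanned for the restriction sites used in the examples. The following is an illustrative sketch only: the recognition sequences are standard enzyme data supplied here as an assumption, and the function and constant names are hypothetical.

```python
# Oligonucleotides as listed above, with their declared lengths.
OLIGOS = {
    4: ("ATACCCGCGG CAGTCTTCTA ACCGAGGTCG", 30),
    5: ("CCCCACGTCT ACGTTGCCAA GTTCACTCTC GAGATA", 36),
    6: ("CTCGAGAATT CATGGCCGAG GAAGCTT", 27),
}

# Recognition sequences (standard enzyme data, assumed here).
SITES = {
    "SacII": "CCGCGG",
    "XhoI": "CTCGAG",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
}


def clean(seq):
    """Strip the spacing used in the listing."""
    return seq.replace(" ", "").upper()


def find_sites(seq):
    """Map each enzyme whose site occurs to its first 0-based position."""
    seq = clean(seq)
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for number, (seq, declared) in OLIGOS.items():
    print(number, len(clean(seq)) == declared, find_sites(seq))
```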
(2) INFORMATION FOR SEQ 10 N0:7: 

20 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1998 base pairs 

(B) TYPE: nucleic acid 

(C) STRANOEDNESS: single 
(0) TOPOLOGY: linear 



(ii) MOLECULE TYPE: ONA (genomic) 



27 



30 ( X i) SEQUENCE DESCRIPTION: SEQ ID N0:7: 

ATGGCCGAAG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC 60 

AAGGACGGCG TGCGTTCCAG CCGCATGAGC GTCGACCCGG CCATCGCCGA CACCAACGGC 120 

CAGGGCGTGC TGCACTACTC CATGGTCCTG GAGGGCGGCA ACGACGCGCT CAAGCTGGCC 180 

ATCGACAACG CCCTCAGCAT CACCAGCGAC GGCCTGACCA TCCGCCTCGA AGGCGGCGTC 240 

GAGCCGAACA AGCCGGTGCG CTACAGCTAC ACGCGCCAGG CGCGCGGCAG TTGGTCGCTG 300 

AACTGGCTGG TACCGATCGG CCACGAGAAG CCCTCGAACA TCAAGGTGTT CATCCACGAA 360 

CTGAACGCCG GCAACCAGCT CAGCCACATG TCGCCGATCT ACACCATCGA GATGGGCGAC 420 

GAGTTGCTGG CGAAGCTGGC GCGCGATGCC ACCTTCTTCG TCAGGGCGCA CGAGAGCAAC 480 

GAGATGCAGC CGACGCTCGC CATCAGCCAT GCCGGGGTCA GCGTGGTCAT GGCCCAGACC 540 

CAGCCGCGCC GGGAAAAGCG CTGGAGCGAA TGGGCCAGCG GCAAGGTGTT GTGCCTGCTC 600 

GACCCGCTGG ACGGGGTCTA CAACTACCTC GCCCAGCAAC GCTGCAACCT CGACGATACC 660 

TGGGAAGGCA AGATCTACCG GGTGCTCGCC GGCAACCCGG CGAAGCATGA CCTGGACATC 720 

AAACCCACGG TCATCAGTCA TCGCCTGCAC TTTCCCGAGG GCGGCAGCCT GGCCGCGCTG 780 

ACCGCGCACC AGGCTTGCCA CCTGCCGCTG GAGACTTTCA CCCGTCATCG CCAGCCGCGC 840 

GGCTGGGAAC AACTGGAGCA GTGCGGCTAT CCGGTGCAGC GGCTGGTCGC CCTCTACCTG 900 

GCGGCGCGGC TGTCGTGGAA CCAGGTCGAC CAGGTGATCC GCAACGCCCT GGCCAGCCCC 960 

GGCAGCGGCG GCGACCTGGG CGAAGCGATC CGCGAGCAGC CGGAGCAGGC CCGTCTGGCC 1020 

CTGACCCTGG CCGCCGCCGA GAGCGAGCGC TTCGTCCGGC AGGGCACCGG CAACGACGAG 1080 

GCCGGCGCGG CCAACGCCGA CGTGGTGAGC CTGACCTGCC CGGTCGCCGC CGGTGAATGC 1140 

GCGGGCCCGG CGGACAGCGG CGACGCCCTG CTGGAGCGCA ACTATCCCAC TGGCGCGGAG 1200 

TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCAGTCTTCT AACCGAGGTC 1260 

GAAACGTACG TTCTCTCTAT CATCCCGTCA GGCCCCCTCA AAGCCGAGAT CGCACAGAGA 1320 

CTTGAAGATG TCTTTGCAGG GAAGAACACC GATCTTGAGG TTCTCATGGA ATGGCTAAAG 1380 

ACAAGACCAA TCCTGTCACC TCTGACTAAG GGGATTTTAG GATTTGTGTT CACGCTCACC 1440 

GTGCCCAGTG AGCGAGGACT GCAGCGTAGA CGCTTTGTCC AAAATGCCCT TAATGGGAAC 1500 

GGGGATCCAA ATAACATGGA CAAAGCAGTT AAACTGTATA GGAAGCTCAA GAGGGAGATA 1560 

ACATTCCATG GGGCCAAAGA AATCTCACTC AGTTATTCTG CTGGTGCACT TGCCAGTTGT 1620 

ATGGGCCTCA TATACAACAG GATGGGGGCT GTGACCACTG AAGTGGCATT TGGCCTGGTA 1680 

TGTGCAACCT GTGAACAGAT TGCTGACTCC CAGCATCGGT CTCATAGGCA AATGGTGACA 1740 

ACAACCAACC CACTAATCAG ACATGAGAAC AGAATGGTTT TAGCCAGCAC TACAGCTAAG 1800 

GCTATGGAGC AAATGGCTGG ATCGAGTGAG CAAGCAGCAG AGGCCATGGA GGTTGCTAGT 1860 

CAGGCTAGGC AAATGGTGCA AGCGATGAGA ACCATTGGGA CTCATCCTAG CTCCAGTGCT 1920 

GGTCTGAAAA ATGATCTTCT TGAAAATTTG CAGGCCTATC AGAAACGAAT GGGGGTGCAG 1980 

ATGCAACGGT TCAAGTGA 1998 

(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 666 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 

Met Ala Glu Glu Ala Phe Asp Leu Trp Asn Glu Cys Ala Lys Ala Cys 
1 5 10 15 

Val Leu Asp Leu Lys Asp Gly Val Arg Ser Ser Arg Met Ser Val Asp 
20 25 30 

Pro Ala Ile Ala Asp Thr Asn Gly Gln Gly Val Leu His Tyr Ser Met 
35 40 45 

Val Leu Glu Gly Gly Asn Asp Ala Leu Lys Leu Ala Ile Asp Asn Ala 
50 55 60 

Leu Ser Ile Thr Ser Asp Gly Leu Thr Ile Arg Leu Glu Gly Gly Val 
65 70 75 80 

Glu Pro Asn Lys Pro Val Arg Tyr Ser Tyr Thr Arg Gln Ala Arg Gly 
85 90 95 

Ser Trp Ser Leu Asn Trp Leu Val Pro Ile Gly His Glu Lys Pro Ser 
100 105 110 

Asn Ile Lys Val Phe Ile His Glu Leu Asn Ala Gly Asn Gln Leu Ser 
115 120 125 

His Met Ser Pro Ile Tyr Thr Ile Glu Met Gly Asp Glu Leu Leu Ala 
130 135 140 

Lys Leu Ala Arg Asp Ala Thr Phe Phe Val Arg Ala His Glu Ser Asn 
145 150 155 160 

Glu Met Gln Pro Thr Leu Ala Ile Ser His Ala Gly Val Ser Val Val 
165 170 175 

Met Ala Gln Thr Gln Pro Arg Arg Glu Lys Arg Trp Ser Glu Trp Ala 
180 185 190 

Ser Gly Lys Val Leu Cys Leu Leu Asp Pro Leu Asp Gly Val Tyr Asn 
195 200 205 

Tyr Leu Ala Gln Gln Arg Cys Asn Leu Asp Asp Thr Trp Glu Gly Lys 
210 215 220 

Ile Tyr Arg Val Leu Ala Gly Asn Pro Ala Lys His Asp Leu Asp Ile 
225 230 235 240 

Lys Pro Thr Val Ile Ser His Arg Leu His Phe Pro Glu Gly Gly Ser 
245 250 255 

Leu Ala Ala Leu Thr Ala His Gln Ala Cys His Leu Pro Leu Glu Thr 
260 265 270 

Phe Thr Arg His Arg Gln Pro Arg Gly Trp Glu Gln Leu Glu Gln Cys 
275 280 285 

Gly Tyr Pro Val Gln Arg Leu Val Ala Leu Tyr Leu Ala Ala Arg Leu 
290 295 300 

Ser Trp Asn Gln Val Asp Gln Val Ile Arg Asn Ala Leu Ala Ser Pro 
305 310 315 320 

Gly Ser Gly Gly Asp Leu Gly Glu Ala Ile Arg Glu Gln Pro Glu Gln 
325 330 335 

Ala Arg Leu Ala Leu Thr Leu Ala Ala Ala Glu Ser Glu Arg Phe Val 
340 345 350 

Arg Gln Gly Thr Gly Asn Asp Glu Ala Gly Ala Ala Asn Ala Asp Val 
355 360 365 

Val Ser Leu Thr Cys Pro Val Ala Ala Gly Glu Cys Ala Gly Pro Ala 
370 375 380 

Asp Ser Gly Asp Ala Leu Leu Glu Arg Asn Tyr Pro Thr Gly Ala Glu 
385 390 395 400 

Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr Arg Gly Ser Leu 
405 410 415 

Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Ile Pro Ser Gly Pro 
420 425 430 

Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu Asp Val Phe Ala Gly Lys 
435 440 445 

Asn Thr Asp Leu Glu Val Leu Met Glu Trp Leu Lys Thr Arg Pro Ile 
450 455 460 

Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly Phe Val Phe Thr Leu Thr 
465 470 475 480 

Val Pro Ser Glu Arg Gly Leu Gln Arg Arg Arg Phe Val Gln Asn Ala 
485 490 495 

Leu Asn Gly Asn Gly Asp Pro Asn Asn Met Asp Lys Ala Val Lys Leu 
500 505 510 

Tyr Arg Lys Leu Lys Arg Glu Ile Thr Phe His Gly Ala Lys Glu Ile 
515 520 525 

Ser Leu Ser Tyr Ser Ala Gly Ala Leu Ala Ser Cys Met Gly Leu Ile 
530 535 540 

Tyr Asn Arg Met Gly Ala Val Thr Thr Glu Val Ala Phe Gly Leu Val 
545 550 555 560 

Cys Ala Thr Cys Glu Gln Ile Ala Asp Ser Gln His Arg Ser His Arg 
565 570 575 

Gln Met Val Thr Thr Thr Asn Pro Leu Ile Arg His Glu Asn Arg Met 
580 585 590 

Val Leu Ala Ser Thr Thr Ala Lys Ala Met Glu Gln Met Ala Gly Ser 
595 600 605 

Ser Glu Gln Ala Ala Glu Ala Met Glu Val Ala Ser Gln Ala Arg Gln 
610 615 620 

Met Val Gln Ala Met Arg Thr Ile Gly Thr His Pro Ser Ser Ser Ala 
625 630 635 640 

Gly Leu Lys Asn Asp Leu Leu Glu Asn Leu Gln Ala Tyr Gln Lys Arg 
645 650 655 

Met Gly Val Gln Met Gln Arg Phe Lys Xaa 
660 665 
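
SEQ ID NO:7 is listed as 1998 base pairs and SEQ ID NO:8 as 666 amino acids, and 1998 / 3 = 666, so the terminal Xaa appears to stand for the stop codon at the end of the reading frame. As a sketch of how the two listings can be cross-checked, the code below translates the first and last printed rows of SEQ ID NO:7 with the standard genetic code. The helper `translate` is a hypothetical name introduced here, and the use of the standard code table is an assumption, since the listing does not name one.

```python
from itertools import product

# Standard genetic code laid out in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1 to one-letter amino acids ('*' marks a stop)."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Rows copied from SEQ ID NO:7: bases 1-60 and bases 1981-1998.
FIRST_ROW = "ATGGCCGAAG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC"
LAST_ROW = "ATGCAACGGT TCAAGTGA"

print(translate(FIRST_ROW))  # MAEEAFDLWNECAKACVLDL (residues 1-20 of SEQ ID NO:8)
print(translate(LAST_ROW))   # MQRFK* (residues 661-666, with '*' where Xaa is listed)
```

Because row 34 starts at base 1981 and 1980 is divisible by three, the last row can be translated on its own without shifting the frame.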

(2) INFORMATION FOR SEQ ID NO:9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 
CTAGAAATAA TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGCCGAA GA 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
ATACCCGCGG CAAGGGGATT TTAGGATTTG TG 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATAGAGCTCT CACACGGTGA GCGTGAACAC AAATCC 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CCGCGGCAAG GGGATTTTAG GATTTGTGTT CACGCTCACC GTGTGAGAGC TC 
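
One possible reading of SEQ ID NO:10 to NO:12, which the listing itself does not state, is that SEQ ID NO:10 and SEQ ID NO:11 are partially complementary oligonucleotides and that SEQ ID NO:12 is the fragment spanned by the two once they are annealed and extended. The sketch below tests that assumption by reverse-complementing SEQ ID NO:11, finding its overlap with the 3' end of SEQ ID NO:10, and checking whether SEQ ID NO:12 is contained in the merged strand. The helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (spaces are ignored)."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


def longest_overlap(left, right):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


# Sequences copied from the listing with the block spacing removed.
SEQ_ID_NO_10 = "ATACCCGCGGCAAGGGGATTTTAGGATTTGTG"
SEQ_ID_NO_11 = "ATAGAGCTCTCACACGGTGAGCGTGAACACAAATCC"
SEQ_ID_NO_12 = "CCGCGGCAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGTGAGAGCTC"

bottom = reverse_complement(SEQ_ID_NO_11)
k = longest_overlap(SEQ_ID_NO_10, bottom)
merged = SEQ_ID_NO_10 + bottom[k:]

print(k)                          # 9
print(merged.find(SEQ_ID_NO_12))  # 4
```

Under this reading the two oligonucleotides share a nine-base overlap, and SEQ ID NO:12 corresponds to the merged strand without its first four and last three bases.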
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
CTAGAAATAA TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGCCGAA GA 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1281 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
ATGGCCGAGG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC 
AAGGACGGCG TGCGTTCCAG CCGCATGAGC GTCGACCCGG CCATCGCCGA CACCAACGGC 
CAGGGCGTGC TGCACTACTC CATGGTCCTG GAGGGCGGCA ACGACGCGCT CAAGCTGGCC 
ATCGACAACG CCCTCAGCAT CACCAGCGAC GGCCTGACCA TCCGCCTCGA AGGCGGCGTC 
GAGCCGAACA AGCCGGTGCG CTACAGCTAC ACGCGCCAGG CGCGCGGCAG TTGGTCGCTG 
AACTGGCTGG TACCGATCGG CCACGAGAAG CCCTCGAACA TCAAGGTGTT CATCCACGAA 
CTGAACGCCG GCAACCAGCT CAGCCACATG TCGCCGATCT ACACCATCGA GATGGGCGAC 
GAGTTGCTGG CGAAGCTGGC GCGCGATGCC ACCTTCTTCG TCAGGGCGCA CGAGAGCAAC 
GAGATGCAGC CGACGCTCGC CATCAGCCAT GCCGGGGTCA GCGTGGTCAT GGCCCAGACC 
CAGCCGCGCC GGGAAAAGCG CTGGAGCGAA TGGGCCAGCG GCAAGGTGTT GTGCCTGCTC 
GACCCGCTGG ACGGGGTCTA CAACTACCTC GCCCAGCAAC GCTGCAACCT CGACGATACC 
TGGGAAGGCA AGATCTACCG GGTGCTCGCC GGCAACCCGG CGAAGCATGA CCTGGACATC 
AAACCCACGG TCATCAGTCA TCGCCTGCAC TTTCCCGAGG GCGGCAGCCT GGCCGCGCTG 
ACCGCGCACC AGGCTTGCCA CCTGCCGCTG GAGACTTTCA CCCGTCATCG CCAGCCGCGC 
GGCTGGGAAC AACTGGAGCA GTGCGGCTAT CCGGTGCAGC GGCTGGTCGC CCTCTACCTG 
GCGGCGCGGC TGTCGTGGAA CCAGGTCGAC CAGGTGATCC GCAACGCCCT GGCCAGCCCC 
GGCAGCGGCG GCGACCTGGG CGAAGCGATC CGCGAGCAGC CGGAGCAGGC CCGTCTGGCC 
CTGACCCTGG CCGCCGCCGA GAGCGAGCGC TTCGTCCGGC AGGGCACCGG CAACGACGAG 
GCCGGCGCGG CCAACGCCGA CGTGGTGAGC CTGACCTGCC CGGTCGCCGC CGGTGAATGC 
GCGGGCCCGG CGGACAGCGG CGACGCCCTG CTGGAGCGCA ACTATCCCAC TGGCGCGGAG 
TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCAAGGGGAT TTTAGGATTT 
GTGTTCACGC TCACCGTGTG A 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 427 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Met Ala Glu Glu Ala Phe Asp Leu Trp Asn Glu Cys Ala Lys Ala Cys 
1 5 10 15 

Val Leu Asp Leu Lys Asp Gly Val Arg Ser Ser Arg Met Ser Val Asp 
20 25 30 

Pro Ala Ile Ala Asp Thr Asn Gly Gln Gly Val Leu His Tyr Ser Met 
35 40 45 

Val Leu Glu Gly Gly Asn Asp Ala Leu Lys Leu Ala Ile Asp Asn Ala 
50 55 60 

Leu Ser Ile Thr Ser Asp Gly Leu Thr Ile Arg Leu Glu Gly Gly Val 
65 70 75 80 

Glu Pro Asn Lys Pro Val Arg Tyr Ser Tyr Thr Arg Gln Ala Arg Gly 
85 90 95 

Ser Trp Ser Leu Asn Trp Leu Val Pro Ile Gly His Glu Lys Pro Ser 
100 105 110 

Asn Ile Lys Val Phe Ile His Glu Leu Asn Ala Gly Asn Gln Leu Ser 
115 120 125 

His Met Ser Pro Ile Tyr Thr Ile Glu Met Gly Asp Glu Leu Leu Ala 
130 135 140 

Lys Leu Ala Arg Asp Ala Thr Phe Phe Val Arg Ala His Glu Ser Asn 
145 150 155 160 

Glu Met Gln Pro Thr Leu Ala Ile Ser His Ala Gly Val Ser Val Val 
165 170 175 

Met Ala Gln Thr Gln Pro Arg Arg Glu Lys Arg Trp Ser Glu Trp Ala 
180 185 190 

Ser Gly Lys Val Leu Cys Leu Leu Asp Pro Leu Asp Gly Val Tyr Asn 
195 200 205 

Tyr Leu Ala Gln Gln Arg Cys Asn Leu Asp Asp Thr Trp Glu Gly Lys 
210 215 220 

Ile Tyr Arg Val Leu Ala Gly Asn Pro Ala Lys His Asp Leu Asp Ile 
225 230 235 240 

Lys Pro Thr Val Ile Ser His Arg Leu His Phe Pro Glu Gly Gly Ser 
245 250 255 






Leu Ala Ala Leu Thr Ala His Gln Ala Cys His Leu Pro Leu Glu Thr 
260 265 270 

Phe Thr Arg His Arg Gln Pro Arg Gly Trp Glu Gln Leu Glu Gln Cys 
275 280 285 

Gly Tyr Pro Val Gln Arg Leu Val Ala Leu Tyr Leu Ala Ala Arg Leu 
290 295 300 

Ser Trp Asn Gln Val Asp Gln Val Ile Arg Asn Ala Leu Ala Ser Pro 
305 310 315 320 

Gly Ser Gly Gly Asp Leu Gly Glu Ala Ile Arg Glu Gln Pro Glu Gln 
325 330 335 

Ala Arg Leu Ala Leu Thr Leu Ala Ala Ala Glu Ser Glu Arg Phe Val 
340 345 350 

Arg Gln Gly Thr Gly Asn Asp Glu Ala Gly Ala Ala Asn Ala Asp Val 
355 360 365 

Val Ser Leu Thr Cys Pro Val Ala Ala Gly Glu Cys Ala Gly Pro Ala 
370 375 380 

Asp Ser Gly Asp Ala Leu Leu Glu Arg Asn Tyr Pro Thr Gly Ala Glu 
385 390 395 400 

Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr Arg Gly Lys Gly 
405 410 415 

Ile Leu Gly Phe Val Phe Thr Leu Thr Val Xaa 
420 425 
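
For comparison against sequence databases, which generally use one-letter amino acid codes, the three-letter listing can be converted mechanically. The sketch below converts residues 401 to 427 of SEQ ID NO:15; the function name `to_one_letter` is a hypothetical helper, and mapping Xaa to X is a convention assumed here rather than something the listing specifies.

```python
# Three-letter to one-letter amino acid codes; Xaa is mapped to X by assumption.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_seq):
    """Convert a whitespace-separated three-letter sequence to one-letter code.

    Raises KeyError on any token that is not a valid residue code, which also
    makes the function usable as a quick check for misprinted residues.
    """
    return "".join(THREE_TO_ONE[residue] for residue in three_letter_seq.split())


# Residues 401-427 of SEQ ID NO:15, copied from the listing.
TAIL_SEQ_ID_NO_15 = (
    "Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr Arg Gly Lys Gly "
    "Ile Leu Gly Phe Val Phe Thr Leu Thr Val Xaa"
)

print(to_one_letter(TAIL_SEQ_ID_NO_15))  # FLGDGGDVSFSTRGKGILGFVFTLTVX
```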

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GGCTGATAAT AGAGCTCG 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1245 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

ATGGCCGAGG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC 60 

AAGGACGGCG TGCGTTCCAG CCGCATGAGC GTCGACCCGG CCATCGCCGA CACCAACGGC 120 

CAGGGCGTGC TGCACTACTC CATGGTCCTG GAGGGCGGCA ACGACGCGCT CAAGCTGGCC 180 

ATCGACAACG CCCTCAGCAT CACCAGCGAC GGCCTGACCA TCCGCCTCGA AGGCGGCGTC 240 

GAGCCGAACA AGCCGGTGCG CTACAGCTAC ACGCGCCAGG CGCGCGGCAG TTGGTCGCTG 300 

AACTGGCTGG TACCGATCGG CCACGAGAAG CCCTCGAACA TCAAGGTGTT CATCCACGAA 360 

CTGAACGCCG GCAACCAGCT CAGCCACATG TCGCCGATCT ACACCATCGA GATGGGCGAC 420 

GAGTTGCTGG CGAAGCTGGC GCGCGATGCC ACCTTCTTCG TCAGGGCGCA CGAGAGCAAC 480 

GAGATGCAGC CGACGCTCGC CATCAGCCAT GCCGGGGTCA GCGTGGTCAT GGCCCAGACC 540 

CAGCCGCGCC GGGAAAAGCG CTGGAGCGAA TGGGCCAGCG GCAAGGTGTT GTGCCTGCTC 600 

GACCCGCTGG ACGGGGTCTA CAACTACCTC GCCCAGCAAC GCTGCAACCT CGACGATACC 660 

TGGGAAGGCA AGATCTACCG GGTGCTCGCC GGCAACCCGG CGAAGCATGA CCTGGACATC 720 

AAACCCACGG TCATCAGTCA TCGCCTGCAC TTTCCCGAGG GCGGCAGCCT GGCCGCGCTG 780 

ACCGCGCACC AGGCTTGCCA CCTGCCGCTG GAGACTTTCA CCCGTCATCG CCAGCCGCGC 840 

GGCTGGGAAC AACTGGAGCA GTGCGGCTAT CCGGTGCAGC GGCTGGTCGC CCTCTACCTG 900 

GCGGCGCGGC TGTCGTGGAA CCAGGTCGAC CAGGTGATCC GCAACGCCCT GGCCAGCCCC 960 

GGCAGCGGCG GCGACCTGGG CGAAGCGATC CGCGAGCAGC CGGAGCAGGC CCGTCTGGCC 1020 

CTGACCCTGG CCGCCGCCGA GAGCGAGCGC TTCGTCCGGC AGGGCACCGG CAACGACGAG 1080 

GCCGGCGCGG CCAACGCCGA CGTGGTGAGC CTGACCTGCC CGGTCGCCGC CGGTGAATGC 1140 

GCGGGCCCGG CGGACAGCGG CGACGCCCTG CTGGAGCGCA ACTATCCCAC TGGCGCGGAG 1200 

TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCTGA 1245 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 415 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Met Ala Glu Glu Ala Phe Asp Leu Trp Asn Glu Cys Ala Lys Ala Cys 
1 5 10 15 

Val Leu Asp Leu Lys Asp Gly Val Arg Ser Ser Arg Met Ser Val Asp 
20 25 30 

Pro Ala Ile Ala Asp Thr Asn Gly Gln Gly Val Leu His Tyr Ser Met 
35 40 45 

Val Leu Glu Gly Gly Asn Asp Ala Leu Lys Leu Ala Ile Asp Asn Ala 
50 55 60 

Leu Ser Ile Thr Ser Asp Gly Leu Thr Ile Arg Leu Glu Gly Gly Val 
65 70 75 80 

Glu Pro Asn Lys Pro Val Arg Tyr Ser Tyr Thr Arg Gln Ala Arg Gly 
85 90 95 

Ser Trp Ser Leu Asn Trp Leu Val Pro Ile Gly His Glu Lys Pro Ser 
100 105 110 

Asn Ile Lys Val Phe Ile His Glu Leu Asn Ala Gly Asn Gln Leu Ser 
115 120 125 

His Met Ser Pro Ile Tyr Thr Ile Glu Met Gly Asp Glu Leu Leu Ala 
130 135 140 

Lys Leu Ala Arg Asp Ala Thr Phe Phe Val Arg Ala His Glu Ser Asn 
145 150 155 160 

Glu Met Gln Pro Thr Leu Ala Ile Ser His Ala Gly Val Ser Val Val 
165 170 175 

Met Ala Gln Thr Gln Pro Arg Arg Glu Lys Arg Trp Ser Glu Trp Ala 
180 185 190 

Ser Gly Lys Val Leu Cys Leu Leu Asp Pro Leu Asp Gly Val Tyr Asn 
195 200 205 

Tyr Leu Ala Gln Gln Arg Cys Asn Leu Asp Asp Thr Trp Glu Gly Lys 
210 215 220 

Ile Tyr Arg Val Leu Ala Gly Asn Pro Ala Lys His Asp Leu Asp Ile 
225 230 235 240 

Lys Pro Thr Val Ile Ser His Arg Leu His Phe Pro Glu Gly Gly Ser 
245 250 255 

Leu Ala Ala Leu Thr Ala His Gln Ala Cys His Leu Pro Leu Glu Thr 
260 265 270 

Phe Thr Arg His Arg Gln Pro Arg Gly Trp Glu Gln Leu Glu Gln Cys 
275 280 285 

Gly Tyr Pro Val Gln Arg Leu Val Ala Leu Tyr Leu Ala Ala Arg Leu 
290 295 300 

Ser Trp Asn Gln Val Asp Gln Val Ile Arg Asn Ala Leu Ala Ser Pro 
305 310 315 320 

Gly Ser Gly Gly Asp Leu Gly Glu Ala Ile Arg Glu Gln Pro Glu Gln 
325 330 335 

Ala Arg Leu Ala Leu Thr Leu Ala Ala Ala Glu Ser Glu Arg Phe Val 
340 345 350 

Arg Gln Gly Thr Gly Asn Asp Glu Ala Gly Ala Ala Asn Ala Asp Val 
355 360 365 

Val Ser Leu Thr Cys Pro Val Ala Ala Gly Glu Cys Ala Gly Pro Ala 
370 375 380 

Asp Ser Gly Asp Ala Leu Leu Glu Arg Asn Tyr Pro Thr Gly Ala Glu 
385 390 395 400 

Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr Arg Gly Xaa 
405 410 415 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TCGAGCCGCC ACCATGGCCG AGGAA 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
GACCCGCTAG CACCCGGGAA ACCGCCGCGC GAGGACCTGA AGTAAG 
(2) INFORMATION FOR SEQ ID NO:21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1956 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 

ATGCACCTGA TACCCCATTG GATCCCCCTG GTCGCCAGCC TCGGCCTGCT CGCCGGCGGC 60 

TCGTCCGCGT CCGCCGCCGA GGAAGCTTTC GACCTCTGGA ACGAATGCGC CAAAGCCTGC 120 

GTGCTCGACC TCAAGGACGG CGTGCGTTCC AGCCGCATGA GCGTCGACCC GGCCATCGCC 180 

GACACCAACG GCCAGGGCGT GCTGCACTAC TCCATGGTCC TGGAGGGCGG CAACGACGCG 240 

CTCAAGCTGG CCATCGACAA CGCCCTCAGC ATCACCAGCG ACGGCCTGAC CATCCGCCTC 300 

GAAGGCGGCG TCGAGCCGAA CAAGCCGGTG CGCTACAGCT ACACGCGCCA GGCGCGCGGC 360 

AGTTGGTCGC TGAACTGGCT GGTACCGATC GGCCACGAGA AGCCCTCGAA CATCAAGGTG 420 

TTCATCCACG AACTGAACGC CGGCAACCAG CTCAGCCACA TGTCGCCGAT CTACACCATC 480 

GAGATGGGCG ACGAGTTGCT GGCGAAGCTG GCGCGCGATG CCACCTTCTT CGTCAGGGCG 540 

CACGAGAGCA ACGAGATGCA GCCGACGCTC GCCATCAGCC ATGCCGGGGT CAGCGTGGTC 600 

ATGGCCCAGA CCCAGCCGCG CCGGGAAAAG CGCTGGAGCG AATGGGCCAG CGGCAAGGTG 660 

TTGTGCCTGC TCGACCCGCT GGACGGGGTC TACAACTACC TCGCCCAGCA ACGCTGCAAC 720 

CTCGACGATA CCTGGGAAGG CAAGATCTAC CGGGTGCTCG CCGGCAACCC GGCGAAGCAT 780 

GACCTGGACA TCAAACCCAC GGTCATCAGT CATCGCCTGC ACTTTCCCGA GGGCGGCAGC 840 

CTGGCCGCGC TGACCGCGCA CCAGGCTTGC CACCTGCCGC TGGAGACTTT CACCCGTCAT 900 

CGCCAGCCGC GCGGCTGGGA ACAACTGGAG CAGTGCGGCT ATCCGGTGCA GCGGCTGGTC 960 

GCCCTCTACC TGGCGGCGCG GCTGTCGTGG AACCAGGTCG ACCAGGTGAT CCGCAACGCC 1020 

CTGGCCAGCC CCGGCAGCGG CGGCGACCTG GGCGAAGCGA TCCGCGAGCA GCCGGAGCAG 1080 

GCCCGTCTGG CCCTGACCCT GGCCGCCGCC GAGAGCGAGC GCTTCGTCCG GCAGGGCACC 1140 

GGCAACGACG AGGCCGGCGC GGCCAACGCC GACGTGGTGA GCCTGACCTG CCCGGTCGCC 1200 

GCCGGTGAAT GCGCGGGCCC GGCGGACAGC GGCGACGCCC TGCTGGAGCG CAACTATCCC 1260 

ACTGGCGCGG AGTTCCTCGG CGACGGCGGC GACGTCAGCT TCAGCACCCG CGGCACGCAG 1320 

AACTGGACGG TGGAGCGGCT GCTCCAGGCG CACCGCCAAC TGGAGGAGCG CGGCTATGTG 1380 

TTCGTCGGCT ACCACGGCAC CTTCCTCGAA GCGGCGCAAA GCATCGTCTT CGGCGGGGTG 1440 

CGCGCGCGCA GCCAGGACCT CGACGCGATC TGGCGCGGTT TCTATATCGC CGGCGATCCG 1500 

GCGCTGGCCT ACGGCTACGC CCAGGACCAG GAACCCGACG CACGCGGCCG GATCCGCAAC 1560 

GGTGCCCTGC TGCGGGTCTA TGTGCCGCGC TCGAGCCTGC CGGGCTTCTA CCGCACCAGC 1620 

CTGACCCTGG CCGCGCCGGA GGCGGCGGGC GAGGTCGAAC GGCTGATCGG CCATCCGCTG 1680 

CCGCTGCGCC TGGACGCCAT CACCGGCCCC GAGGAGGAAG GCGGGCGCCT GGAGACCATT 1740 

CTCGGCTGGC CGCTGGCCGA GCGCACCGTG GTGATTCCCT CGGCGATCCC CACCGACCCG 1800 

CGCAACGTCG GCGGCGACCT CGACCCGTCC AGCATCCCCG ACAAGGAACA GGCGATCAGC 1860 

GCCCTGCCGG ACTACGCCAG CCAGCCCGGC AAACCGCCGC GCGAGGACCC GCTAGCACCC 1920 

GGGAAACCGC CGCGCGAGGA CCTGAAGTAA GAATTC 1956 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 652 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

Met His Leu Ile Pro His Trp Ile Pro Leu Val Ala Ser Leu Gly Leu 
1 5 10 15 

Leu Ala Gly Gly Ser Ser Ala Ser Ala Ala Glu Glu Ala Phe Asp Leu 
20 25 30 

Trp Asn Glu Cys Ala Lys Ala Cys Val Leu Asp Leu Lys Asp Gly Val 
35 40 45 

Arg Ser Ser Arg Met Ser Val Asp Pro Ala Ile Ala Asp Thr Asn Gly 
50 55 60 

Gln Gly Val Leu His Tyr Ser Met Val Leu Glu Gly Gly Asn Asp Ala 
65 70 75 80 

Leu Lys Leu Ala Ile Asp Asn Ala Leu Ser Ile Thr Ser Asp Gly Leu 
85 90 95 

Thr Ile Arg Leu Glu Gly Gly Val Glu Pro Asn Lys Pro Val Arg Tyr 
100 105 110 

Ser Tyr Thr Arg Gln Ala Arg Gly Ser Trp Ser Leu Asn Trp Leu Val 
115 120 125 

Pro Ile Gly His Glu Lys Pro Ser Asn Ile Lys Val Phe Ile His Glu 
130 135 140 

Leu Asn Ala Gly Asn Gln Leu Ser His Met Ser Pro Ile Tyr Thr Ile 
145 150 155 160 

Glu Met Gly Asp Glu Leu Leu Ala Lys Leu Ala Arg Asp Ala Thr Phe 
165 170 175 

Phe Val Arg Ala His Glu Ser Asn Glu Met Gln Pro Thr Leu Ala Ile 
180 185 190 

Ser His Ala Gly Val Ser Val Val Met Ala Gln Thr Gln Pro Arg Arg 
195 200 205 

Glu Lys Arg Trp Ser Glu Trp Ala Ser Gly Lys Val Leu Cys Leu Leu 
210 215 220 

Asp Pro Leu Asp Gly Val Tyr Asn Tyr Leu Ala Gln Gln Arg Cys Asn 
225 230 235 240 

Leu Asp Asp Thr Trp Glu Gly Lys Ile Tyr Arg Val Leu Ala Gly Asn 
245 250 255 

Pro Ala Lys His Asp Leu Asp Ile Lys Pro Thr Val Ile Ser His Arg 
260 265 270 

Leu His Phe Pro Glu Gly Gly Ser Leu Ala Ala Leu Thr Ala His Gln 
275 280 285 

Ala Cys His Leu Pro Leu Glu Thr Phe Thr Arg His Arg Gln Pro Arg 
290 295 300 

Gly Trp Glu Gln Leu Glu Gln Cys Gly Tyr Pro Val Gln Arg Leu Val 
305 310 315 320 

Ala Leu Tyr Leu Ala Ala Arg Leu Ser Trp Asn Gln Val Asp Gln Val 
325 330 335 

Ile Arg Asn Ala Leu Ala Ser Pro Gly Ser Gly Gly Asp Leu Gly Glu 
340 345 350 

Ala Ile Arg Glu Gln Pro Glu Gln Ala Arg Leu Ala Leu Thr Leu Ala 
355 360 365 

Ala Ala Glu Ser Glu Arg Phe Val Arg Gln Gly Thr Gly Asn Asp Glu 
370 375 380 

Ala Gly Ala Ala Asn Ala Asp Val Val Ser Leu Thr Cys Pro Val Ala 
385 390 395 400 

Ala Gly Glu Cys Ala Gly Pro Ala Asp Ser Gly Asp Ala Leu Leu Glu 
405 410 415 

Arg Asn Tyr Pro Thr Gly Ala Glu Phe Leu Gly Asp Gly Gly Asp Val 
420 425 430 

Ser Phe Ser Thr Arg Gly Thr Gln Asn Trp Thr Val Glu Arg Leu Leu 
435 440 445 

Gln Ala His Arg Gln Leu Glu Glu Arg Gly Tyr Val Phe Val Gly Tyr 
450 455 460 

His Gly Thr Phe Leu Glu Ala Ala Gln Ser Ile Val Phe Gly Gly Val 
465 470 475 480 

Arg Ala Arg Ser Gln Asp Leu Asp Ala Ile Trp Arg Gly Phe Tyr Ile 
485 490 495 

Ala Gly Asp Pro Ala Leu Ala Tyr Gly Tyr Ala Gln Asp Gln Glu Pro 
500 505 510 

Asp Ala Arg Gly Arg Ile Arg Asn Gly Ala Leu Leu Arg Val Tyr Val 
515 520 525 

Pro Arg Ser Ser Leu Pro Gly Phe Tyr Arg Thr Ser Leu Thr Leu Ala 
530 535 540 

Ala Pro Glu Ala Ala Gly Glu Val Glu Arg Leu Ile Gly His Pro Leu 
545 550 555 560 

Pro Leu Arg Leu Asp Ala Ile Thr Gly Pro Glu Glu Glu Gly Gly Arg 
565 570 575 

Leu Glu Thr Ile Leu Gly Trp Pro Leu Ala Glu Arg Thr Val Val Ile 
580 585 590 

Pro Ser Ala Ile Pro Thr Asp Pro Arg Asn Val Gly Gly Asp Leu Asp 
595 600 605 

Pro Ser Ser Ile Pro Asp Lys Glu Gln Ala Ile Ser Ala Leu Pro Asp 
610 615 620 

Tyr Ala Ser Gln Pro Gly Lys Pro Pro Arg Glu Asp Pro Leu Ala Pro 
625 630 635 640 

Gly Lys Pro Pro Arg Glu Asp Leu Lys Xaa Glu Phe 
645 650 



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 
CCGGGCTGAC TAAGGGGATT TTAGGATTTG TGTTCACGCT CACCGTGC 48 
(2) INFORMATION FOR SEQ ID NO:24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2004 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

ATGCACCTGA TACCCCATTG GATCCCCCTG GTCGCCAGCC TCGGCCTGCT CGCCGGCGGC 60 

TCGTCCGCGT CCGCCGCCGA GGAAGCTTTC GACCTCTGGA ACGAATGCGC CAAAGCCTGC 120 

GTGCTCGACC TCAAGGACGG CGTGCGTTCC AGCCGCATGA GCGTCGACCC GGCCATCGCC 180 

GACACCAACG GCCAGGGCGT GCTGCACTAC TCCATGGTCC TGGAGGGCGG CAACGACGCG 240 

CTCAAGCTGG CCATCGACAA CGCCCTCAGC ATCACCAGCG ACGGCCTGAC CATCCGCCTC 300 

GAAGGCGGCG TCGAGCCGAA CAAGCCGGTG CGCTACAGCT ACACGCGCCA GGCGCGCGGC 360 

AGTTGGTCGC TGAACTGGCT GGTACCGATC GGCCACGAGA AGCCCTCGAA CATCAAGGTG 420 

TTCATCCACG AACTGAACGC CGGCAACCAG CTCAGCCACA TGTCGCCGAT CTACACCATC 480 

GAGATGGGCG ACGAGTTGCT GGCGAAGCTG GCGCGCGATG CCACCTTCTT CGTCAGGGCG 540 

CACGAGAGCA ACGAGATGCA GCCGACGCTC GCCATCAGCC ATGCCGGGGT CAGCGTGGTC 600 

ATGGCCCAGA CCCAGCCGCG CCGGGAAAAG CGCTGGAGCG AATGGGCCAG CGGCAAGGTG 660 

TTGTGCCTGC TCGACCCGCT GGACGGGGTC TACAACTACC TCGCCCAGCA ACGCTGCAAC 720 

CTCGACGATA CCTGGGAAGG CAAGATCTAC CGGGTGCTCG CCGGCAACCC GGCGAAGCAT 780 

GACCTGGACA TCAAACCCAC GGTCATCAGT CATCGCCTGC ACTTTCCCGA GGGCGGCAGC 840 

CTGGCCGCGC TGACCGCGCA CCAGGCTTGC CACCTGCCGC TGGAGACTTT CACCCGTCAT 900 

CGCCAGCCGC GCGGCTGGGA ACAACTGGAG CAGTGCGGCT ATCCGGTGCA GCGGCTGGTC 960 

GCCCTCTACC TGGCGGCGCG GCTGTCGTGG AACCAGGTCG ACCAGGTGAT CCGCAACGCC 1020 

CTGGCCAGCC CCGGCAGCGG CGGCGACCTG GGCGAAGCGA TCCGCGAGCA GCCGGAGCAG 1080 

GCCCGTCTGG CCCTGACCCT GGCCGCCGCC GAGAGCGAGC GCTTCGTCCG GCAGGGCACC 1140 

GGCAACGACG AGGCCGGCGC GGCCAACGCC GACGTGGTGA GCCTGACCTG CCCGGTCGCC 1200 

GCCGGTGAAT GCGCGGGCCC GGCGGACAGC GGCGACGCCC TGCTGGAGCG CAACTATCCC 1260 

ACTGGCGCGG AGTTCCTCGG CGACGGCGGC GACGTCAGCT TCAGCACCCG CGGCACGCAG 1320 

AACTGGACGG TGGAGCGGCT GCTCCAGGCG CACCGCCAAC TGGAGGAGCG CGGCTATGTG 1380 

TTCGTCGGCT ACCACGGCAC CTTCCTCGAA GCGGCGCAAA GCATCGTCTT CGGCGGGGTG 1440 

CGCGCGCGCA GCCAGGACCT CGACGCGATC TGGCGCGGTT TCTATATCGC CGGCGATCCG 1500 

GCGCTGGCCT ACGGCTACGC CCAGGACCAG GAACCCGACG CACGCGGCCG GATCCGCAAC 1560 

GGTGCCCTGC TGCGGGTCTA TGTGCCGCGC TCGAGCCTGC CGGGCTTCTA CCGCACCAGC 1620 

CTGACCCTGG CCGCGCCGGA GGCGGCGGGC GAGGTCGAAC GGCTGATCGG CCATCCGCTG 1680 

CCGCTGCGCC TGGACGCCAT CACCGGCCCC GAGGAGGAAG GCGGGCGCCT GGAGACCATT 1740 

CTCGGCTGGC CGCTGGCCGA GCGCACCGTG GTGATTCCCT CGGCGATCCC CACCGACCCG 1800 

CGCAACGTCG GCGGCGACCT CGACCCGTCC AGCATCCCCG ACAAGGAACA GGCGATCAGC 1860 

GCCCTGCCGG ACTACGCCAG CCAGCCCGGC AAACCGCCGC GCGAGGACCC GCTAGCACCC 1920 

GGGCTGACTA AGGGGATTTT AGGATTTGTG TTCACGCTCA CCGTGCCCGG GAAACCGCCG 1980 

CGCGAGGACC TGAAGTAAGA ATTC 2004 
(2) INFORMATION FOR SEQ ID NO:25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 668 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 

Met His Leu Ile Pro His Trp Ile Pro Leu Val Ala Ser Leu Gly Leu 
1 5 10 15 

Leu Ala Gly Gly Ser Ser Ala Ser Ala Ala Glu Glu Ala Phe Asp Leu 
20 25 30 

Trp Asn Glu Cys Ala Lys Ala Cys Val Leu Asp Leu Lys Asp Gly Val 
35 40 45 

Arg Ser Ser Arg Met Ser Val Asp Pro Ala Ile Ala Asp Thr Asn Gly 
50 55 60 

Gln Gly Val Leu His Tyr Ser Met Val Leu Glu Gly Gly Asn Asp Ala 
65 70 75 80 

Leu Lys Leu Ala Ile Asp Asn Ala Leu Ser Ile Thr Ser Asp Gly Leu 
85 90 95 

Thr Ile Arg Leu Glu Gly Gly Val Glu Pro Asn Lys Pro Val Arg Tyr 
100 105 110 




Ser Tyr Thr Arg Gln Ala Arg Gly Ser Trp Ser Leu Asn Trp Leu Val 
115 120 125 

Pro Ile Gly His Glu Lys Pro Ser Asn Ile Lys Val Phe Ile His Glu 
130 135 140 

Leu Asn Ala Gly Asn Gln Leu Ser His Met Ser Pro Ile Tyr Thr Ile 
145 150 155 160 

Glu Met Gly Asp Glu Leu Leu Ala Lys Leu Ala Arg Asp Ala Thr Phe 
165 170 175 

Phe Val Arg Ala His Glu Ser Asn Glu Met Gln Pro Thr Leu Ala Ile 
180 185 190 

Ser His Ala Gly Val Ser Val Val Met Ala Gln Thr Gln Pro Arg Arg 
195 200 205 

Glu Lys Arg Trp Ser Glu Trp Ala Ser Gly Lys Val Leu Cys Leu Leu 
210 215 220 

Asp Pro Leu Asp Gly Val Tyr Asn Tyr Leu Ala Gln Gln Arg Cys Asn 
225 230 235 240 

Leu Asp Asp Thr Trp Glu Gly Lys Ile Tyr Arg Val Leu Ala Gly Asn 
245 250 255 

Pro Ala Lys His Asp Leu Asp Ile Lys Pro Thr Val Ile Ser His Arg 
260 265 270 

Leu His Phe Pro Glu Gly Gly Ser Leu Ala Ala Leu Thr Ala His Gln 
275 280 285 

Ala Cys His Leu Pro Leu Glu Thr Phe Thr Arg His Arg Gln Pro Arg 
290 295 300 

Gly Trp Glu Gln Leu Glu Gln Cys Gly Tyr Pro Val Gln Arg Leu Val 
305 310 315 320 

Ala Leu Tyr Leu Ala Ala Arg Leu Ser Trp Asn Gln Val Asp Gln Val 
325 330 335 

Ile Arg Asn Ala Leu Ala Ser Pro Gly Ser Gly Gly Asp Leu Gly Glu 
340 345 350 

Ala Ile Arg Glu Gln Pro Glu Gln Ala Arg Leu Ala Leu Thr Leu Ala 
355 360 365 

Ala Ala Glu Ser Glu Arg Phe Val Arg Gln Gly Thr Gly Asn Asp Glu 
370 375 380 

Ala Gly Ala Ala Asn Ala Asp Val Val Ser Leu Thr Cys Pro Val Ala 
385 390 395 400 

Ala Gly Glu Cys Ala Gly Pro Ala Asp Ser Gly Asp Ala Leu Leu Glu 
405 410 415 

Arg Asn Tyr Pro Thr Gly Ala Glu Phe Leu Gly Asp Gly Gly Asp Val 
420 425 430 

Ser Phe Ser Thr Arg Gly Thr Gln Asn Trp Thr Val Glu Arg Leu Leu 
435 440 445 

Gln Ala His Arg Gln Leu Glu Glu Arg Gly Tyr Val Phe Val Gly Tyr 
450 455 460 

His Gly Thr Phe Leu Glu Ala Ala Gln Ser Ile Val Phe Gly Gly Val 
465 470 475 480 

Arg Ala Arg Ser Gln Asp Leu Asp Ala Ile Trp Arg Gly Phe Tyr Ile 
485 490 495 

Ala Gly Asp Pro Ala Leu Ala Tyr Gly Tyr Ala Gln Asp Gln Glu Pro 
500 505 510 

Asp Ala Arg Gly Arg Ile Arg Asn Gly Ala Leu Leu Arg Val Tyr Val 
515 520 525 

Pro Arg Ser Ser Leu Pro Gly Phe Tyr Arg Thr Ser Leu Thr Leu Ala 
530 535 540 

Ala Pro Glu Ala Ala Gly Glu Val Glu Arg Leu Ile Gly His Pro Leu 
545 550 555 560 

Pro Leu Arg Leu Asp Ala Ile Thr Gly Pro Glu Glu Glu Gly Gly Arg 
565 570 575 

Leu Glu Thr Ile Leu Gly Trp Pro Leu Ala Glu Arg Thr Val Val Ile 
580 585 590 

Pro Ser Ala Ile Pro Thr Asp Pro Arg Asn Val Gly Gly Asp Leu Asp 
595 600 605 



Pro Ser Ser Ile Pro Asp Lys Glu Gln Ala Ile Ser Ala Leu Pro Asp 
610 615 620 

Tyr Ala Ser Gln Pro Gly Lys Pro Pro Arg Glu Asp Pro Leu Ala Pro 
625 630 635 640 

Gly Leu Thr Lys Gly Ile Leu Gly Phe Val Phe Thr Leu Thr Val Pro 
645 650 655 

Gly Lys Pro Pro Arg Glu Asp Leu Lys Xaa Glu Phe 
660 665 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 
GCACCCGGGA TCCCGTCAGG CCCCCTC 
(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
GCACCCGGGC TCCCTCTTGA GCTTCCT 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2238 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

ATGCACCTGA TACCCCATTG GATCCCCCTG GTCGCCAGCC TCGGCCTGCT CGCCGGCGGC 60 

TCGTCCGCGT CCGCCGCCGA GGAAGCTTTC GACCTCTGGA ACGAATGCGC CAAAGCCTGC 120 

GTGCTCGACC TCAAGGACGG CGTGCGTTCC AGCCGCATGA GCGTCGACCC GGCCATCGCC 180 

GACACCAACG GCCAGGGCGT GCTGCACTAC TCCATGGTCC TGGAGGGCGG CAACGACGCG 240 

CTCAAGCTGG CCATCGACAA CGCCCTCAGC ATCACCAGCG ACGGCCTGAC CATCCGCCTC 300 

GAAGGCGGCG TCGAGCCGAA CAAGCCGGTG CGCTACAGCT ACACGCGCCA GGCGCGCGGC 360 

AGTTGGTCGC TGAACTGGCT GGTACCGATC GGCCACGAGA AGCCCTCGAA CATCAAGGTG 420 

TTCATCCACG AACTGAACGC CGGCAACCAG CTCAGCCACA TGTCGCCGAT CTACACCATC 480 

GAGATGGGCG ACGAGTTGCT GGCGAAGCTG GCGCGCGATG CCACCTTCTT CGTCAGGGCG 540 

CACGAGAGCA ACGAGATGCA GCCGACGCTC GCCATCAGCC ATGCCGGGGT CAGCGTGGTC 600 

ATGGCCCAGA CCCAGCCGCG CCGGGAAAAG CGCTGGAGCG AATGGGCCAG CGGCAAGGTG 660 

TTGTGCCTGC TCGACCCGCT GGACGGGGTC TACAACTACC TCGCCCAGCA ACGCTGCAAC 720 

CTCGACGATA CCTGGGAAGG CAAGATCTAC CGGGTGCTCG CCGGCAACCC GGCGAAGCAT 780 

GACCTGGACA TCAAACCCAC GGTCATCAGT CATCGCCTGC ACTTTCCCGA GGGCGGCAGC 840 

CTGGCCGCGC TGACCGCGCA CCAGGCTTGC CACCTGCCGC TGGAGACTTT CACCCGTCAT 900 

CGCCAGCCGC GCGGCTGGGA ACAACTGGAG CAGTGCGGCT ATCCGGTGCA GCGGCTGGTC 960 

GCCCTCTACC TGGCGGCGCG GCTGTCGTGG AACCAGGTCG ACCAGGTGAT CCGCAACGCC 1020 

CTGGCCAGCC CCGGCAGCGG CGGCGACCTG GGCGAAGCGA TCCGCGAGCA GCCGGAGCAG 1080 

GCCCGTCTGG CCCTGACCCT GGCCGCCGCC GAGAGCGAGC GCTTCGTCCG GCAGGGCACC 1140 

GGCAACGACG AGGCCGGCGC GGCCAACGCC GACGTGGTGA GCCTGACCTG CCCGGTCGCC 1200 

GCCGGTGAAT GCGCGGGCCC GGCGGACAGC GGCGACGCCC TGCTGGAGCG CAACTATCCC 1260 

ACTGGCGCGG AGTTCCTCGG CGACGGCGGC GACGTCAGCT TCAGCACCCG CGGCACGCAG 1320 
AACTGGACGG TGGAGCGGCT GCTCCAGGCG CACCGCCAAC TGGAGGAGCG CGGCTATGTG 1380 
TTCGTCGGCT ACCACGGCAC CTTCCTCGAA GCGGCGCAAA GCATCGTCTT CGGCGGGGTG 1440 
CGCGCGCGCA GCCAGGACCT CGACGCGATC TGGCGCGGTT TCTATATCGC CGGCGATCCG 1500 
GCGCTGGCCT ACGGCTACGC CCAGGACCAG GAACCCGACG CACGCGGCCG GATCCGCAAC 1560 
GGTGCCCTGC TGCGGGTCTA TGTGCCGCGC TCGAGCCTGC CGGGCTTCTA CCGCACCAGC 1620 
CTGACCCTGG CCGCGCCGGA GGCGGCGGGC GAGGTCGAAC GGCTGATCGG CCATCCGCTG 1680 
CCGCTGCGCC TGGACGCCAT CACCGGCCCC GAGGAGGAAG GCGGGCGCCT GGAGACCATT 1740 
CTCGGCTGGC CGCTGGCCGA GCGCACCGTG GTGATTCCCT CGGCGATCCC CACCGACCCG 1800 
CGCAACGTCG GCGGCGACCT CGACCCGTCC AGCATCCCCG ACAAGGAACA GGCGATCAGC 1860 
GCCCTGCCGG ACTACGCCAG CCAGCCCGGC AAACCGCCGC GCGAGGACCC GCTAGCACCC 1920 
GGGATCCCGT CAGGCCCCCT CAAAGCCGAG ATCGCACAGA GACTTGAAGA TGTCTTTGCA 1980 
GGGAAGAACA CCGATCTTGA GGTTCTCATG GAATGGCTAA AGACAAGACC AATCCTGTCA 2040 
CCTCTGACTA AGGGGATTTT AGGATTTGTG TTCACGCTCA CCGTGCCCAG TGAGCGAGGA 2100 
CTGCAGCGTA GACGCTTTGT CCAAAATGCC CTTAATGGGA ACGGGGATCC AAATAACATG 2160 
GACAAAGCAG TTAAACTGTA TAGGAAGCTC AAGAGGGAGC CCGGGAAACC GCCGCGCGAG 2220 
GACCTGAAGT AAGAATTC 2238 
(2) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 746 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: 

Met His Leu Ile Pro His Trp Ile Pro Leu Val Ala Ser Leu Gly Leu 
1 5 10 15 

Leu Ala Gly Gly Ser Ser Ala Ser Ala Ala Glu Glu Ala Phe Asp Leu 
20 25 30 

Trp Asn Glu Cys Ala Lys Ala Cys Val Leu Asp Leu Lys Asp Gly Val 
35 40 45 

Arg Ser Ser Arg Met Ser Val Asp Pro Ala Ile Ala Asp Thr Asn Gly 
50 55 60 

Gln Gly Val Leu His Tyr Ser Met Val Leu Glu Gly Gly Asn Asp Ala 
65 70 75 80 

Leu Lys Leu Ala Ile Asp Asn Ala Leu Ser Ile Thr Ser Asp Gly Leu 
85 90 95 

Thr Ile Arg Leu Glu Gly Gly Val Glu Pro Asn Lys Pro Val Arg Tyr 
100 105 110 

Ser Tyr Thr Arg Gln Ala Arg Gly Ser Trp Ser Leu Asn Trp Leu Val 
115 120 125 

Pro Ile Gly His Glu Lys Pro Ser Asn Ile Lys Val Phe Ile His Glu 
130 135 140 

Leu Asn Ala Gly Asn Gln Leu Ser His Met Ser Pro Ile Tyr Thr Ile 
145 150 155 160 

Glu Met Gly Asp Glu Leu Leu Ala Lys Leu Ala Arg Asp Ala Thr Phe 
165 170 175 

Phe Val Arg Ala His Glu Ser Asn Glu Met Gln Pro Thr Leu Ala Ile 
180 185 190 

Ser His Ala Gly Val Ser Val Val Met Ala Gln Thr Gln Pro Arg Arg 
195 200 205 

Glu Lys Arg Trp Ser Glu Trp Ala Ser Gly Lys Val Leu Cys Leu Leu 
210 215 220 

Asp Pro Leu Asp Gly Val Tyr Asn Tyr Leu Ala Gln Gln Arg Cys Asn 
225 230 235 240 

Leu Asp Asp Thr Trp Glu Gly Lys Ile Tyr Arg Val Leu Ala Gly Asn 
245 250 255 

Pro Ala Lys His Asp Leu Asp Ile Lys Pro Thr Val Ile Ser His Arg 
260 265 270 

Leu His Phe Pro Glu Gly Gly Ser Leu Ala Ala Leu Thr Ala His Gln 
275 280 285 

Ala Cys His Leu Pro Leu Glu Thr Phe Thr Arg His Arg Gln Pro Arg 
290 295 300 

Gly Trp Glu Gln Leu Glu Gln Cys Gly Tyr Pro Val Gln Arg Leu Val 
305 310 315 320 

Ala Leu Tyr Leu Ala Ala Arg Leu Ser Trp Asn Gln Val Asp Gln Val 
325 330 335 

Ile Arg Asn Ala Leu Ala Ser Pro Gly Ser Gly Gly Asp Leu Gly Glu 
340 345 350 

Ala Ile Arg Glu Gln Pro Glu Gln Ala Arg Leu Ala Leu Thr Leu Ala 
355 360 365 

Ala Ala Glu Ser Glu Arg Phe Val Arg Gln Gly Thr Gly Asn Asp Glu 
370 375 380 

Ala Gly Ala Ala Asn Ala Asp Val Val Ser Leu Thr Cys Pro Val Ala 
385 390 395 400 

Ala Gly Glu Cys Ala Gly Pro Ala Asp Ser Gly Asp Ala Leu Leu Glu 
405 410 415 

Arg Asn Tyr Pro Thr Gly Ala Glu Phe Leu Gly Asp Gly Gly Asp Val 
420 425 430 

Ser Phe Ser Thr Arg Gly Thr Gln Asn Trp Thr Val Glu Arg Leu Leu 
435 440 445 

Gln Ala His Arg Gln Leu Glu Glu Arg Gly Tyr Val Phe Val Gly Tyr 
450 455 460 

His Gly Thr Phe Leu Glu Ala Ala Gln Ser Ile Val Phe Gly Gly Val 
465 470 475 480 

Arg Ala Arg Ser Gln Asp Leu Asp Ala Ile Trp Arg Gly Phe Tyr Ile 
485 490 495 

Ala Gly Asp Pro Ala Leu Ala Tyr Gly Tyr Ala Gln Asp Gln Glu Pro 
500 505 510 

Asp Ala Arg Gly Arg Ile Arg Asn Gly Ala Leu Leu Arg Val Tyr Val 
515 520 525 

Pro Arg Ser Ser Leu Pro Gly Phe Tyr Arg Thr Ser Leu Thr Leu Ala 
530 535 540 

Ala Pro Glu Ala Ala Gly Glu Val Glu Arg Leu Ile Gly His Pro Leu 
545 550 555 560 

Pro Leu Arg Leu Asp Ala Ile Thr Gly Pro Glu Glu Glu Gly Gly Arg 
565 570 575 

Leu Glu Thr Ile Leu Gly Trp Pro Leu Ala Glu Arg Thr Val Val Ile 
580 585 590 

Pro Ser Ala Ile Pro Thr Asp Pro Arg Asn Val Gly Gly Asp Leu Asp 
595 600 605 

Pro Ser Ser Ile Pro Asp Lys Glu Gln Ala Ile Ser Ala Leu Pro Asp 
610 615 620 

Tyr Ala Ser Gln Pro Gly Lys Pro Pro Arg Glu Asp Pro Leu Ala Pro 
625 630 635 640 

Gly Ile Pro Ser Gly Pro Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu 
645 650 655 

Asp Val Phe Ala Gly Lys Asn Thr Asp Leu Glu Val Leu Met Glu Trp 
660 665 670 

Leu Lys Thr Arg Pro Ile Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly 
675 680 685 

Phe Val Phe Thr Leu Thr Val Pro Ser Glu Arg Gly Leu Gln Arg Arg 
690 695 700 

Arg Phe Val Gln Asn Ala Leu Asn Gly Asn Gly Asp Pro Asn Asn Met 
705 710 715 720 

Asp Lys Ala Val Lys Leu Tyr Arg Lys Leu Lys Arg Glu Pro Gly Lys 
725 730 735 

Pro Pro Arg Glu Asp Leu Lys Xaa Glu Phe 
740 745 

(2) INFORMATION FOR SEQ ID NO:30: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: 
CTAGACTAGT CTAG 14 

(2) INFORMATION FOR SEQ ID NO:31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: 

GGCGGCAGAA AGAGC 15 

(2) INFORMATION FOR SEQ ID NO:32: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: 

Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala Ala Ala Asp 
1 5 10 15 

Ala Asp Thr Ile Cys 
20 

(2) INFORMATION FOR SEQ ID NO:33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 
GGCAGAAAGA TGAAGGCAAA CCTACTGGTC CTGTTATGTG CACTTGCAGC TGCAGATGCA 60 
GACACAATAT GC 72 

(2) INFORMATION FOR SEQ ID NO:34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 

Gly Arg Lys Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala 
1 5 10 15 

Ala Ala Asp Ala Asp Thr Ile Cys 
20 

(2) INFORMATION FOR SEQ ID NO:35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: 
ATGAAGGCAA ACCTACTGGT CCTGTTATGT GCACTTGCAG CTGCAGATGC AGACACAATA 60 
TGA 63 

(2) INFORMATION FOR SEQ ID NO:36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 

Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala Ala Ala Asp 
1 5 10 15 

Ala Asp Thr Ile Xaa 
20 

(2) INFORMATION FOR SEQ ID NO: 37: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 

His His Ala Asn Glu Asn Ile Phe Tyr Cys Pro Ile Ala Ile Met Ser 
1 5 10 15 

Ala Leu Ala Met Val Tyr Leu Gly Ala Lys Asp 
20 25 

(2) INFORMATION FOR SEQ ID NO:38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: 

CACCATGCCA ATGAGAACAT CTTCTACTGC CCCATTGCCA TCATGTCAGC TCTAGCCATG 60 
GTATACCTGG GTGCAAAAAG C 81 
(2) INFORMATION FOR SEQ ID NO:39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39: 

His His Ala Asn Glu Asn Ile Phe Tyr Cys Pro Ile Ala Ile Met Ser 
1 5 10 15 

Ala Leu Ala Met Val Tyr Leu Gly Ala Lys Ser 
20 25 

(2) INFORMATION FOR SEQ ID NO:40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 78 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: 
GGCAGAAAGA TGAAGGCAAA CCTACTGGTC CTGTTATGTG CACTTGCAGC TGCAGATGCA 60 
GACACAATAT GCATGATG 78 
(2) INFORMATION FOR SEQ ID NO:41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41: 

Gly Arg Lys Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala 
1 5 10 15 

Ala Ala Asp Ala Asp Thr Ile Cys Met Met 
20 25 

(2) INFORMATION FOR SEQ ID NO:42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 72 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 
GGCATGAAGG CAAACCTACT GGTCCTGTTA TGTGCACTTG CAGCTGCAGA TGCAGACACA 60 
ATATGCATGA TG 72 

(2) INFORMATION FOR SEQ ID NO:43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43: 

Gly Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala Ala Ala 
1 5 10 15 

Asp Ala Asp Thr Ile Cys Met Met 
20 

(2) INFORMATION FOR SEQ ID NO:44: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 90 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44: 
GTATGCATGC ACCATGCCAA TGAGAACATC TTCTACTGCC CCATTGCCAT CATGTCAGCT 60 
CTAGCCATGG TATACCTGGG TGCAAAAGAC 90 
(2) INFORMATION FOR SEQ ID NO:45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45: 

Val Cys Met His His Ala Asn Glu Asn Ile Phe Tyr Cys Pro Ile Ala 
1 5 10 15 

Ile Met Ser Ala Leu Ala Met Val Tyr Leu Gly Ala Lys Asp 
20 25 30 

(2) INFORMATION FOR SEQ ID NO:46: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 147 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46: 

ATGAAGGCAA ACCTACTGGT CCTGTTATGT GCACTTGCAG CTGCAGATGC AGACACAATA 60 

TGCCACCATG CCAATGAGAA CATCTTCTAC TGCCCCATTG CCATCATGTC AGCTCTAGCC 120 

ATGGTATACC TGGGTGCAAA AGACAGC 147 

(2) INFORMATION FOR SEQ ID NO:47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47: 

Met Lys Ala Asn Leu Leu Val Leu Leu Cys Ala Leu Ala Ala Ala Asp 
1 5 10 15 

Ala Asp Thr Ile Cys His His Ala Asn Glu Asn Ile Phe Tyr Cys Pro 
20 25 30 

Ile Ala Ile Met Ser Ala Leu Ala Met Val Tyr Leu Gly Ala Lys Asp 
35 40 45 



(2) INFORMATION FOR SEQ ID NO:48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 70 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 
CCTATCAGAA ACGAATGGGG GTGCAGATGC AACGGTTCAA GCGCGAGGAC CTGAAGTAAG 60 
AATTCGAGCT 70 

(2) INFORMATION FOR SEQ ID NO:49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2013 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: 

ATGGCCGAGG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC 60 

AAGGACGGCG TGCGTTCCAG CCGCATGAGC GTCGACCCGG CCATCGCCGA CACCAACGGC 120 

CAGGGCGTGC TGCACTACTC CATGGTCCTG GAGGGCGGCA ACGACGCGCT CAAGCTGGCC 180 

ATCGACAACG CCCTCAGCAT CACCAGCGAC GGCCTGACCA TCCGCCTCGA AGGCGGCGTC 240 

GAGCCGAACA AGCCGGTGCG CTACAGCTAC ACGCGCCAGG CGCGCGGCAG TTGGTCGCTG 300 

AACTGGCTGG TACCGATCGG CCACGAGAAG CCCTCGAACA TCAAGGTGTT CATCCACGAA 360 

CTGAACGCCG GCAACCAGCT CAGCCACATG TCGCCGATCT ACACCATCGA GATGGGCGAC 420 

GAGTTGCTGG CGAAGCTGGC GCGCGATGCC ACCTTCTTCG TCAGGGCGCA CGAGAGCAAC 480 

GAGATGCAGC CGACGCTCGC CATCAGCCAT GCCGGGGTCA GCGTGGTCAT GGCCCAGACC 540 

CAGCCGCGCC GGGAAAAGCG CTGGAGCGAA TGGGCCAGCG GCAAGGTGTT GTGCCTGCTC 600 

GACCCGCTGG ACGGGGTCTA CAACTACCTC GCCCAGCAAC GCTGCAACCT CGACGATACC 660 

TGGGAAGGCA AGATCTACCG GGTGCTCGCC GGCAACCCGG CGAAGCATGA CCTGGACATC 720 

AAACCCACGG TCATCAGTCA TCGCCTGCAC TTTCCCGAGG GCGGCAGCCT GGCCGCGCTG 780 

ACCGCGCACC AGGCTTGCCA CCTGCCGCTG GAGACTTTCA CCCGTCATCG CCAGCCGCGC 840 

GGCTGGGAAC AACTGGAGCA GTGCGGCTAT CCGGTGCAGC GGCTGGTCGC CCTCTACCTG 900 

GCGGCGCGGC TGTCGTGGAA CCAGGTCGAC CAGGTGATCC GCAACGCCCT GGCCAGCCCC 960 

GGCAGCGGCG GCGACCTGGG CGAAGCGATC CGCGAGCAGC CGGAGCAGGC CCGTCTGGCC 1020 

CTGACCCTGG CCGCCGCCGA GAGCGAGCGC TTCGTCCGGC AGGGCACCGG CAACGACGAG 1080 

GCCGGCGCGG CCAACGCCGA CGTGGTGAGC CTGACCTGCC CGGTCGCCGC CGGTGAATGC 1140 

GCGGGCCCGG CGGACAGCGG CGACGCCCTG CTGGAGCGCA ACTATCCCAC TGGCGCGGAG 1200 

TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCAGTCTTCT AACCGAGGTC 1260 

GAAACGTACG TTCTCTCTAT CATCCCGTCA GGCCCCCTCA AAGCCGAGAT CGCACAGAGA 1320 

CTTGAAGATG TCTTTGCAGG GAAGAACACC GATCTTGAGG TTCTCATGGA ATGGCTAAAG 1380 

ACAAGACCAA TCCTGTCACC TCTGACTAAG GGGATTTTAG GATTTGTGTT CACGCTCACC 1440 

GTGCCCAGTG AGCGAGGACT GCAGCGTAGA CGCTTTGTCC AAAATGCCCT TAATGGGAAC 1500 

GGGGATCCAA ATAACATGGA CAAAGCAGTT AAACTGTATA GGAAGCTCAA GAGGGAGATA 1560 

ACATTCCATG GGGCCAAAGA AATCTCACTC AGTTATTCTG CTGGTGCACT TGCCAGTTGT 1620 

ATGGGCCTCA TATACAACAG GATGGGGGCT GTGACCACTG AAGTGGCATT TGGCCTGGTA 1680 

TGTGCAACCT GTGAACAGAT TGCTGACTCC CAGCATCGGT CTCATAGGCA AATGGTGACA 1740 

ACAACCAACC CACTAATCAG ACATGAGAAC AGAATGGTTT TAGCCAGCAC TACAGCTAAG 1800 

GCTATGGAGC AAATGGCTGG ATCGAGTGAG CAAGCAGCAG AGGCCATGGA GGTTGCTAGT 1860 

CAGGCTAGGC AAATGGTGCA AGCGATGAGA ACCATTGGGA CTCATCCTAG CTCCAGTGCT 1920 

GGTCTGAAAA ATGATCTTCT TGAAAATTTG CAGGCCTATC AGAAACGAAT GGGGGTGCAG 1980 

ATGCAACGGT TCAAGCGCGA GGACCTGAAG TAA 2013 
(2) INFORMATION FOR SEQ ID NO:50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 671 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50: 

Met Ala Glu Glu Ala Phe Asp Leu Trp Asn Glu Cys Ala Lys Ala Cys 
1 5 10 15 

Val Leu Asp Leu Lys Asp Gly Val Arg Ser Ser Arg Met Ser Val Asp 
20 25 30 

Pro Ala Ile Ala Asp Thr Asn Gly Gln Gly Val Leu His Tyr Ser Met 
35 40 45 

Val Leu Glu Gly Gly Asn Asp Ala Leu Lys Leu Ala Ile Asp Asn Ala 
50 55 60 

Leu Ser Ile Thr Ser Asp Gly Leu Thr Ile Arg Leu Glu Gly Gly Val 
65 70 75 80 

Glu Pro Asn Lys Pro Val Arg Tyr Ser Tyr Thr Arg Gln Ala Arg Gly 
85 90 95 

Ser Trp Ser Leu Asn Trp Leu Val Pro Ile Gly His Glu Lys Pro Ser 
100 105 110 

Asn Ile Lys Val Phe Ile His Glu Leu Asn Ala Gly Asn Gln Leu Ser 
115 120 125 

His Met Ser Pro Ile Tyr Thr Ile Glu Met Gly Asp Glu Leu Leu Ala 
130 135 140 

Lys Leu Ala Arg Asp Ala Thr Phe Phe Val Arg Ala His Glu Ser Asn 
145 150 155 160 

Glu Met Gln Pro Thr Leu Ala Ile Ser His Ala Gly Val Ser Val Val 
165 170 175 

Met Ala Gln Thr Gln Pro Arg Arg Glu Lys Arg Trp Ser Glu Trp Ala 
180 185 190 

Ser Gly Lys Val Leu Cys Leu Leu Asp Pro Leu Asp Gly Val Tyr Asn 
195 200 205 

Tyr Leu Ala Gln Gln Arg Cys Asn Leu Asp Asp Thr Trp Glu Gly Lys 
210 215 220 

Ile Tyr Arg Val Leu Ala Gly Asn Pro Ala Lys His Asp Leu Asp Ile 
225 230 235 240 

Lys Pro Thr Val Ile Ser His Arg Leu His Phe Pro Glu Gly Gly Ser 
245 250 255 

Leu Ala Ala Leu Thr Ala His Gln Ala Cys His Leu Pro Leu Glu Thr 
260 265 270 

Phe Thr Arg His Arg Gln Pro Arg Gly Trp Glu Gln Leu Glu Gln Cys 
275 280 285 

Gly Tyr Pro Val Gln Arg Leu Val Ala Leu Tyr Leu Ala Ala Arg Leu 
290 295 300 

Ser Trp Asn Gln Val Asp Gln Val Ile Arg Asn Ala Leu Ala Ser Pro 
305 310 315 320 

Gly Ser Gly Gly Asp Leu Gly Glu Ala Ile Arg Glu Gln Pro Glu Gln 
325 330 335 

Ala Arg Leu Ala Leu Thr Leu Ala Ala Ala Glu Ser Glu Arg Phe Val 
340 345 350 

Arg Gln Gly Thr Gly Asn Asp Glu Ala Gly Ala Ala Asn Ala Asp Val 
355 360 365 

Val Ser Leu Thr Cys Pro Val Ala Ala Gly Glu Cys Ala Gly Pro Ala 
370 375 380 

Asp Ser Gly Asp Ala Leu Leu Glu Arg Asn Tyr Pro Thr Gly Ala Glu 
385 390 395 400 

Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr Arg Gly Ser Leu 
405 410 415 

Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Ile Pro Ser Gly Pro 
420 425 430 

Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu Asp Val Phe Ala Gly Lys 
435 440 445 

Asn Thr Asp Leu Glu Val Leu Met Glu Trp Leu Lys Thr Arg Pro Ile 
450 455 460 

Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly Phe Val Phe Thr Leu Thr 
465 470 475 480 

Val Pro Ser Glu Arg Gly Leu Gln Arg Arg Arg Phe Val Gln Asn Ala 
485 490 495 

Leu Asn Gly Asn Gly Asp Pro Asn Asn Met Asp Lys Ala Val Lys Leu 
500 505 510 

Tyr Arg Lys Leu Lys Arg Glu Ile Thr Phe His Gly Ala Lys Glu Ile 
515 520 525 

Ser Leu Ser Tyr Ser Ala Gly Ala Leu Ala Ser Cys Met Gly Leu Ile 
530 535 540 

Tyr Asn Arg Met Gly Ala Val Thr Thr Glu Val Ala Phe Gly Leu Val 
545 550 555 560 

Cys Ala Thr Cys Glu Gln Ile Ala Asp Ser Gln His Arg Ser His Arg 
565 570 575 

Gln Met Val Thr Thr Thr Asn Pro Leu Ile Arg His Glu Asn Arg Met 
580 585 590 

Val Leu Ala Ser Thr Thr Ala Lys Ala Met Glu Gln Met Ala Gly Ser 
595 600 605 

Ser Glu Gln Ala Ala Glu Ala Met Glu Val Ala Ser Gln Ala Arg Gln 
610 615 620 

Met Val Gln Ala Met Arg Thr Ile Gly Thr His Pro Ser Ser Ser Ala 
625 630 635 640 

Gly Leu Lys Asn Asp Leu Leu Glu Asn Leu Gln Ala Tyr Gln Lys Arg 
645 650 655 

Met Gly Val Gln Met Gln Arg Phe Lys Arg Glu Asp Leu Lys Xaa 
660 665 670 



(2) INFORMATION FOR SEQ ID NO:51: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 38 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51: 

ATACCCGCGG CATGGCGTCC CAAGGCACCA AACGGTCT 38 

(2) INFORMATION FOR SEQ ID NO:52: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 81 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52: 
ATAGAATTCT TACTTCAGGT CCTCGCGATT GTCGTACTCC TCTGCATTGT CTCCGAAGAA 60 
ATAAGATCCT TCATTACTCA T 81 

(2) INFORMATION FOR SEQ ID NO:53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2754 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53: 

ATGGCCGAGG AAGCTTTCGA CCTCTGGAAC GAATGCGCCA AAGCCTGCGT GCTCGACCTC 60 

AAGGACGGCG TGCGTTCCAG CCGCATGAGC GTCGACCCGG CCATCGCCGA CACCAACGGC 120 

CAGGGCGTGC TGCACTACTC CATGGTCCTG GAGGGCGGCA ACGACGCGCT CAAGCTGGCC 180 

ATCGACAACG CCCTCAGCAT CACCAGCGAC GGCCTGACCA TCCGCCTCGA AGGCGGCGTC 240 

GAGCCGAACA AGCCGGTGCG CTACAGCTAC ACGCGCCAGG CGCGCGGCAG TTGGTCGCTG 300 

AACTGGCTGG TACCGATCGG CCACGAGAAG CCCTCGAACA TCAAGGTGTT CATCCACGAA 360 

CTGAACGCCG GCAACCAGCT CAGCCACATG TCGCCGATCT ACACCATCGA GATGGGCGAC 420 

GAGTTGCTGG CGAAGCTGGC GCGCGATGCC ACCTTCTTCG TCAGGGCGCA CGAGAGCAAC 480 

GAGATGCAGC CGACGCTCGC CATCAGCCAT GCCGGGGTCA GCGTGGTCAT GGCCCAGACC 540 

CAGCCGCGCC GGGAAAAGCG CTGGAGCGAA TGGGCCAGCG GCAAGGTGTT GTGCCTGCTC 600 

GACCCGCTGG ACGGGGTCTA CAACTACCTC GCCCAGCAAC GCTGCAACCT CGACGATACC 660 

TGGGAAGGCA AGATCTACCG GGTGCTCGCC GGCAACCCGG CGAAGCATGA CCTGGACATC 720 

AAACCCACGG TCATCAGTCA TCGCCTGCAC TTTCCCGAGG GCGGCAGCCT GGCCGCGCTG 780 

ACCGCGCACC AGGCTTGCCA CCTGCCGCTG GAGACTTTCA CCCGTCATCG CCAGCCGCGC 840 

GGCTGGGAAC AACTGGAGCA GTGCGGCTAT CCGGTGCAGC GGCTGGTCGC CCTCTACCTG 900 

GCGGCGCGGC TGTCGTGGAA CCAGGTCGAC CAGGTGATCC GCAACGCCCT GGCCAGCCCC 960 

GGCAGCGGCG GCGACCTGGG CGAAGCGATC CGCGAGCAGC CGGAGCAGGC CCGTCTGGCC 1020 

CTGACCCTGG CCGCCGCCGA GAGCGAGCGC TTCGTCCGGC AGGGCACCGG CAACGACGAG 1080 

GCCGGCGCGG CCAACGCCGA CGTGGTGAGC CTGACCTGCC CGGTCGCCGC CGGTGAATGC 1140 

GCGGGCCCGG CGGACAGCGG CGACGCCCTG CTGGAGCGCA ACTATCCCAC TGGCGCGGAG 1200 

TTCCTCGGCG ACGGCGGCGA CGTCAGCTTC AGCACCCGCG GCATGGCGTC CCAAGGCACC 1260 

AAACGGTCTT ACGAACAGAT GGAGACTGAT GGAGAACGCC AGAATGCCAC TGAAATCAGA 1320 

GCATCCGTCG GAAAAATGAT TGGTGGAATT GGACGATTCT ACATCCAAAT GTGCACAGAA 1380 

CTTAAACTCA GTGATTATGA GGGACGGTTG ATCCAAAACA GCTTAACAAT AGAGAGAATG 1440 

GTGCTCTCTG CTTTTGACGA AAGGAGAAAT AAATACCTGG AAGAACATCC CAGTGCGGGG 1500 

AAGGATCCTA AGAAAACTGG AGGACCTATA TACAGAAGAG TAAACGGAAA GTGGATGAGA 1560 

GAACTCATCC TTTATGACAA AGAAGAAATA AGGCGAATCT GGCGCCAAGC TAATAATGGT 1620 

GACGATGCAA CGGCTGGTCT GACTCACATG ATGATCTGGC ATTCCAATTT GAATGATGCA 1680 

ACTTATCAGA GGACAAGGGC TCTTGTTCGC ACCGGAATGG ATCCCAGGAT GTGCTCTCTG 1740 

ATGCAAGGTT CAACTCTCCC TAGGAGGTCT GGAGCCGCAG GTGCTGCAGT CAAAGGAGTT 1800 

GGAACAATGG TGATGGAATT GGTCAGGATG ATCAAACGTG GGATCAATGA TCGGAACTTC 1860 

TGGAGGGGTG AGAATGGACG AAAAACAAGA ATTGCTTATG AAAGAATGTG CAACATTCTC 1920 

AAAGGGAAAT TTCAAACTGC TGCACAAAAA GCAATGATGG ATCAAGTGAG AGAGAGCCGG 1980 

GACCCAGGGA ATGCTGAGTT CGAAGATCTC ACTTTTCTAG CACGGTCTGC ACTCATATTG 2040 

AGAGGGTCGG TTGCTCACAA GTCCTGCCTG CCTGCCTGTG TGTATGGACC TGCCGTAGCC 2100 

AGTGGGTACG ACTTTGAAAG AGAGGGATAC TCTCTAGTCG GAATAGACCC TTTCAGACTG 2160 

CTTCAAAACA GCCAAGTGTA CAGCCTAATC AGACCAAATG AGAATCCAGC ACACAAGAGT 2220 

CAACTGGTGT GGATGGCATG CCATTCTGCC GCATTTGAAG ATCTAAGAGT ATTGAGCTTC 2280 

ATCAAAGGGA CGAAGGTGGT CCCAAGAGGG AAGCTTTCCA CTAGAGGAGT TCAAATTGCT 2340 

TCCAATGAAA ATATGGAGAC TATGGAATCA AGTACACTTG AACTGAGAAG CAGGTACTGG 2400 

GCCATAAGGA CCAGAAGTGG AGGAAACACC AATCAACAGA GGGCATCTGC GGGCCAAATC 2460 

AGCATACAAC CTACGTTCTC AGTACAGAGA AATCTCCCTT TTGACAGAAC AACCGTTATG 2520 

GCAGCATTCA CTGGGAATAC AGAGGGGAGA ACATCTGACA TGAGGACCGA AATCATAAGG 2580 

ATGATGGAAA GTGCAAGACC AGAAGATGTG TCTTTCCAGG GGCGGGGAGT CTTCGAGCTC 2640 

TCGGACGAAA AGGCAGCGAG CCCGATCGTG CCTTCCTTTG ACATGAGTAA TGAAGGATCT 2700 

TATTTCTTCG GAGACAATGC AGAGGAGTAC GACAATCGCG AGGACCTGAA GTAA 2754 
(2) INFORMATION FOR SEQ ID NO:54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 918 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54: 

Met Ala Glu Glu Ala Phe Asp Leu Trp Asn Glu Cys Ala Lys Ala Cys 
1 5 10 15 

Val Leu Asp Leu Lys Asp Gly Val Arg Ser Ser Arg Met Ser Val Asp 
20 25 30 

Pro Ala Ile Ala Asp Thr Asn Gly Gln Gly Val Leu His Tyr Ser Met 
35 40 45 

Val Leu Glu Gly Gly Asn Asp Ala Leu Lys Leu Ala Ile Asp Asn Ala 
50 55 60 

Leu Ser Ile Thr Ser Asp Gly Leu Thr Ile Arg Leu Glu Gly Gly Val 
65 70 75 80 

Glu Pro Asn Lys Pro Val Arg Tyr Ser Tyr Thr Arg Gln Ala Arg Gly 
85 90 95 

Ser Trp Ser Leu Asn Trp Leu Val Pro Ile Gly His Glu Lys Pro Ser 
100 105 110 

Asn Ile Lys Val Phe Ile His Glu Leu Asn Ala Gly Asn Gln Leu Ser 
115 120 125 

His Met Ser Pro Ile Tyr Thr Ile Glu Met Gly Asp Glu Leu Leu Ala 
130 135 140 

Lys Leu Ala Arg Asp Ala Thr Phe Phe Val Arg Ala His Glu Ser Asn 
145 150 155 160 

Glu Met Gln Pro Thr Leu Ala Ile Ser His Ala Gly Val Ser Val Val 
165 170 175 

Met Ala Gln Thr Gln Pro Arg Arg Glu Lys Arg Trp Ser Glu Trp Ala 
180 185 190 

Ser Gly Lys Val Leu Cys Leu Leu Asp Pro Leu Asp Gly Val Tyr Asn 
195 200 205 

Tyr Leu Ala Gln Gln Arg Cys Asn Leu Asp Asp Thr Trp Glu Gly Lys 
210 215 220 

Ile Tyr Arg Val Leu Ala Gly Asn Pro Ala Lys His Asp Leu Asp Ile 
225 230 235 240 

Lys Pro Thr Val Ile Ser His Arg Leu His Phe Pro Glu Gly Gly Ser 
245 250 255 

Leu Ala Ala Leu Thr Ala His Gln Ala Cys His Leu Pro Leu Glu Thr 
260 265 270 

Phe Thr Arg His Arg Gln Pro Arg Gly Trp Glu Gln Leu Glu Gln Cys 
275 280 285 

Gly Tyr Pro Val Gln Arg Leu Val Ala Leu Tyr Leu Ala Ala Arg Leu 
290 295 300 

Ser Trp Asn Gln Val Asp Gln Val Ile Arg Asn Ala Leu Ala Ser Pro 
305 310 315 320 

Gly Ser Gly Gly Asp Leu Gly Glu Ala Ile Arg Glu Gln Pro Glu Gln 
325 330 335 

Ala Arg Leu Ala Leu Thr Leu Ala Ala Ala Glu Ser Glu Arg Phe Val 
340 345 350 

Arg Gln Gly Thr Gly Asn Asp Glu Ala Gly Ala Ala Asn Ala Asp Val 
355 360 365 

Val Ser Leu Thr Cys Pro Val Ala Ala Gly Glu Cys Ala Gly Pro Ala 
370 375 380 

Asp Ser Gly Asp Ala Leu Leu Glu Arg Asn Tyr Pro Thr Gly Ala Glu 
385 390 395 400 

Phe Leu Gly Asp Gly Gly Asp Val Ser Phe Ser Thr Arg Gly Met Ala 
405 410 415 

Ser Gln Gly Thr Lys Arg Ser Tyr Glu Gln Met Glu Thr Asp Gly Glu 
420 425 430 

Arg Gln Asn Ala Thr Glu Ile Arg Ala Ser Val Gly Lys Met Ile Gly 
435 440 445 

Gly Ile Gly Arg Phe Tyr Ile Gln Met Cys Thr Glu Leu Lys Leu Ser 
450 455 460 

Asp Tyr Glu Gly Arg Leu Ile Gln Asn Ser Leu Thr Ile Glu Arg Met 
465 470 475 480 

Val Leu Ser Ala Phe Asp Glu Arg Arg Asn Lys Tyr Leu Glu Glu His 
485 490 495 

Pro Ser Ala Gly Lys Asp Pro Lys Lys Thr Gly Gly Pro Ile Tyr Arg 
500 505 510 

Arg Val Asn Gly Lys Trp Met Arg Glu Leu Ile Leu Tyr Asp Lys Glu 
515 520 525 

Glu Ile Arg Arg Ile Trp Arg Gln Ala Asn Asn Gly Asp Asp Ala Thr 
530 535 540 

Ala Gly Leu Thr His Met Met Ile Trp His Ser Asn Leu Asn Asp Ala 
545 550 555 560 

Thr Tyr Gln Arg Thr Arg Ala Leu Val Arg Thr Gly Met Asp Pro Arg 
565 570 575 

Met Cys Ser Leu Met Gln Gly Ser Thr Leu Pro Arg Arg Ser Gly Ala 
580 585 590 

Ala Gly Ala Ala Val Lys Gly Val Gly Thr Met Val Met Glu Leu Val 
595 600 605 

Arg Met Ile Lys Arg Gly Ile Asn Asp Arg Asn Phe Trp Arg Gly Glu 
610 615 620 

Asn Gly Arg Lys Thr Arg Ile Ala Tyr Glu Arg Met Cys Asn Ile Leu 
625 630 635 640 

Lys Gly Lys Phe Gln Thr Ala Ala Gln Lys Ala Met Met Asp Gln Val 
645 650 655 

Arg Glu Ser Arg Asp Pro Gly Asn Ala Glu Phe Glu Asp Leu Thr Phe 
660 665 670 

Leu Ala Arg Ser Ala Leu Ile Leu Arg Gly Ser Val Ala His Lys Ser 
675 680 685 

Cys Leu Pro Ala Cys Val Tyr Gly Pro Ala Val Ala Ser Gly Tyr Asp 
690 695 700 

Phe Glu Arg Glu Gly Tyr Ser Leu Val Gly Ile Asp Pro Phe Arg Leu 
705 710 715 720 

Leu Gln Asn Ser Gln Val Tyr Ser Leu Ile Arg Pro Asn Glu Asn Pro 
725 730 735 

Ala His Lys Ser Gln Leu Val Trp Met Ala Cys His Ser Ala Ala Phe 
740 745 750 

Glu Asp Leu Arg Val Leu Ser Phe Ile Lys Gly Thr Lys Val Val Pro 
755 760 765 

Arg Gly Lys Leu Ser Thr Arg Gly Val Gln Ile Ala Ser Asn Glu Asn 
770 775 780 

Met Glu Thr Met Glu Ser Ser Thr Leu Glu Leu Arg Ser Arg Tyr Trp 
785 790 795 800 

Ala Ile Arg Thr Arg Ser Gly Gly Asn Thr Asn Gln Gln Arg Ala Ser 
805 810 815 

Ala Gly Gln Ile Ser Ile Gln Pro Thr Phe Ser Val Gln Arg Asn Leu 
820 825 830 

Pro Phe Asp Arg Thr Thr Val Met Ala Ala Phe Thr Gly Asn Thr Glu 
835 840 845 

Gly Arg Thr Ser Asp Met Arg Thr Glu Ile Ile Arg Met Met Glu Ser 
850 855 860 

Ala Arg Pro Glu Asp Val Ser Phe Gln Gly Arg Gly Val Phe Glu Leu 
865 870 875 880 

Ser Asp Glu Lys Ala Ala Ser Pro Ile Val Pro Ser Phe Asp Met Ser 
885 890 895 

Asn Glu Gly Ser Tyr Phe Phe Gly Asp Asn Ala Glu Glu Tyr Asp Asn 
900 905 910 

Arg Glu Asp Leu Lys Xaa 
915 

(2) INFORMATION FOR SEQ ID NO:55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 35 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: 
ATACCCGCGG CATGGGTGCG AGAGCGTCGG TATAT 35 



(2) INFORMATION FOR SEQ ID NO:56: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 36 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: 
ATAGAATTCT CATTGTGACG AGGGGTCGCT GCCAAA 36 

(2) INFORMATION FOR SEQ ID NO:57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2814 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57: 

ATGAAAAAGA CAGCTATCGC GATTGCAGTG GCACTGGCTG GTTTCGCTAC CGTAGCGCAG 60 

GCCGCGAATT TGGCCGAAGA AGCTTTCGAC CTCTGGAACG AATGCGCCAA AGCCTGCGTG 120 

CTCGACCTCA AGGACGGCGT GCGTTCCAGC CGCATGAGCG TCGACCCGGC CATCGCCGAC 180 

ACCAACGGCC AGGGCGTGCT GCACTACTCC ATGGTCCTGG AGGGCGGCAA CGACGCGCTC 240 

AAGCTGGCCA TCGACAACGC CCTCAGCATC ACCAGCGACG GCCTGACCAT CCGCCTCGAA 300 

GGCGGCGTCG AGCCGAACAA GCCGGTGCGC TACAGCTACA CGCGCCAGGC GCGCGGCAGT 360 

TGGTCGCTGA ACTGGCTGGT ACCGATCGGC CACGAGAAGC CCTCGAACAT CAAGGTGTTC 420 

ATCCACGAAC TGAACGCCGG CAACCAGCTC AGCCACATGT CGCCGATCTA CACCATCGAG 480 

ATGGGCGACG AGTTGCTGGC GAAGCTGGCG CGCGATGCCA CCTTCTTCGT CAGGGCGCAC 540 

GAGAGCAACG AGATGCAGCC GACGCTCGCC ATCAGCCATG CCGGGGTCAG CGTGGTCATG 600 

GCCCAGACCC AGCCGCGCCG GGAAAAGCGC TGGAGCGAAT GGGCCAGCGG CAAGGTGTTG 660 

TGCCTGCTCG ACCCGCTGGA CGGGGTCTAC AACTACCTCG CCCAGCAACG CTGCAACCTC 720 

GACGATACCT GGGAAGGCAA GATCTACCGG GTGCTCGCCG GCAACCCGGC GAAGCATGAC 780 

CTGGACATCA AACCCACGGT CATCAGTCAT CGCCTGCACT TTCCCGAGGG CGGCAGCCTG 840 

GCCGCGCTGA CCGCGCACCA GGCTTGCCAC CTGCCGCTGG AGACTTTCAC CCGTCATCGC 900 

CAGCCGCGCG GCTGGGAACA ACTGGAGCAG TGCGGCTATC CGGTGCAGCG GCTGGTCGCC 960 

CTCTACCTGG CGGCGCGGCT GTCGTGGAAC CAGGTCGACC AGGTGATCCG CAACGCCCTG 1020 
GCCAGCCCCG GCAGCGGCGG CGACCTGGGC GAAGCGATCC GCGAGCAGCC GGAGCAGGCC 1080

CGTCTGGCCC TGACCCTGGC CGCCGCCGAG AGCGAGCGCT TCGTCCGGCA GGGCACCGGC 1140

AACGACGAGG CCGGCGCGGC CAACGCCGAC GTGGTGAGCC TGACCTGCCC GGTCGCCGCC 1200

GGTGAATGCG CGGGCCCGGC GGACAGCGGC GACGCCCTGC TGGAGCGCAA CTATCCCACT 1260

GGCGCGGAGT TCCTCGGCGA CGGCGGCGAC GTCAGCTTCA GCACCCGCGG CATGGGTGCG 1320

AGAGCGTCGG TATTAAGCGG GGGAGAATTA GATAAATGGG AAAAAATTCG GTTAAGGCCA 1380

GGGGGAAAGA AACAATATAA ACTAAAACAT ATAGTATGGG CAAGCAGGGA GCTAGAACGA 1440

TTCGCAGTTA ATCCTGGCCT TTTAGAGACA TCAGAAGGCT GTAGACAAAT ACTGGGACAG 1500

CTACAACCAT CCCTTCAGAC AGGATCAGAA GAACTTAGAT CATTATATAA TACAATAGCA 1560

GTCCTCTATT GTGTGCATCA AAGGATAGAT GTAAAAGACA CCAAGGAAGC CTTAGATAAG 1620

ATAGAGGAAG AGCAAAACAA AAGTAAGAAA AAGGCACAGC AAGCAGCAGC TGACACAGGA 1680

AACAACAGCC AGGTCAGCCA AAATTACCCT ATAGTGCAGA ACCTCCAGGG GCAAATGGTA 1740

CATCAGGCCA TATCACCTAG AACTTTAAAT GCATGGGTAA AAGTAGTAGA AGAGAAGGCT 1800

TTCAGCCCAG AAGTAATACC CATGTTTTCA GCATTATCAG AAGGAGCCAC CCCACAAGAT 1860

TTAAATACCA TGCTAAACAC AGTGGGGGGA CATCAAGCAG CCATGCAAAT GTTAAAAGAG 1920

ACCATCAATG AGGAAGCTGC AGAATGGGAT AGATTGCATC CAGTGCATGC AGGGCCTATT 1980

GCACCAGGCC AGATGAGAGA ACCAAGGGGA AGTGACATAG CAGGAACTAC TAGTACCCTT 2040

CAGGAACAAA TAGGATGGAT GACACATAAT CCACCTATCC CAGTAGGAGA AATCTATAAA 2100

AGATGGATAA TCCTGGGATT AAATAAAATA GTAAGAATGT ATAGCCCTAC CAGCATTCTG 2160

GACATAAGAC AAGGACCAAA GGAACCCTTT AGAGACTATG TAGACCGATT CTATAAAACT 2220

CTAAGAGCCG AGCAAGCTTC ACAAGAGGTA AAAAATTGGA TGACAGAAAC CTTGTTGGTC 2280

CAAAATGCGA ACCCAGATTG TAAGACTATT TTAAAAGCAT TGGGACCAGG AGCGACACTA 2340

GAAGAAATGA TGACAGCATG TCAGGGAGTG GGGGGACCCG GCCATAAAGC AAGAGTTTTG 2400

GCTGAAGCAA TGAGCCAAGT AACAAATCCA GCTACCATAA TGATACAGAA AGGCAATTTT 2460

AGGAACCAAA GAAAGACTGT TAAGTGTTTC AATTGTGGCA AAGAAGGGCA CATAGCCAAA 2520

AATTGCAGGG CCCCTAGGAA AAAGGGCTGT TGGAAATGTG GAAAGGAAGG ACACCAAATG 2580

AAAGATTGTA CTGAGAGACA GGCTAATTTT TTAGGGAAGA TCTGGCCTTC CCACAAGGGA 2640 

AGGCCAGGGA ATTTTCTTCA GAGCAGACCA GAGCCAACAG CCCCACCAGA AGAGAGCTTC 2700 

AGGTTTGGGG AAGAGACAAC AACTCCCTCT CAGAAGCAGG AGCCGATAGA CAAGGAACTG 2760 

TATCCTTTAG CTTCCCTCAG ATCACTCTTT GGCAGCGACC CCTCGTCACA ATGA 2814 
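
The relationship between the short sequence SEQ ID NO:56 and the 3' end of SEQ ID NO:57 can be illustrated with a reverse-complement comparison. The following Python sketch is an illustration only and is not part of the original application. Reading SEQ ID NO:56 as an antisense oligonucleotide is an assumption made here; the listing itself does not state its role. The name `SEQ_57_TAIL` is hypothetical and holds the last 27 bases of SEQ ID NO:57 as printed above.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


# SEQ ID NO:56, copied from the listing with spaces removed.
SEQ_56 = "ATAGAATTCTCATTGTGACGAGGGGTCGCTGCCAAA"

# Last 27 bases of SEQ ID NO:57 (hypothetical helper constant).
SEQ_57_TAIL = "TTTGGCAGCGACCCCTCGTCACAATGA"

rc = reverse_complement(SEQ_56)
print(rc)
print(rc.startswith(SEQ_57_TAIL))
```

Under these assumptions the reverse complement begins with the final 27 bases of SEQ ID NO:57, including the TGA stop codon, and continues with GAATTC, the EcoRI recognition sequence.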
(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 938 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 



Met Lys Lys Thr Ala Ile Ala Ile Ala Val Ala Leu Ala Gly Phe Ala
1 5 10 15

Thr Val Ala Gln Ala Ala Asn Leu Ala Glu Glu Ala Phe Asp Leu Trp
20 25 30

Asn Glu Cys Ala Lys Ala Cys Val Leu Asp Leu Lys Asp Gly Val Arg
35 40 45

Ser Ser Arg Met Ser Val Asp Pro Ala Ile Ala Asp Thr Asn Gly Gln
50 55 60

Gly Val Leu His Tyr Ser Met Val Leu Glu Gly Gly Asn Asp Ala Leu
65 70 75 80

Lys Leu Ala Ile Asp Asn Ala Leu Ser Ile Thr Ser Asp Gly Leu Thr
85 90 95

Ile Arg Leu Glu Gly Gly Val Glu Pro Asn Lys Pro Val Arg Tyr Ser
100 105 110

Tyr Thr Arg Gln Ala Arg Gly Ser Trp Ser Leu Asn Trp Leu Val Pro
115 120 125

Ile Gly His Glu Lys Pro Ser Asn Ile Lys Val Phe Ile His Glu Leu
130 135 140

Asn Ala Gly Asn Gln Leu Ser His Met Ser Pro Ile Tyr Thr Ile Glu
145 150 155 160

Met Gly Asp Glu Leu Leu Ala Lys Leu Ala Arg Asp Ala Thr Phe Phe
165 170 175

Val Arg Ala His Glu Ser Asn Glu Met Gln Pro Thr Leu Ala Ile Ser
180 185 190

His Ala Gly Val Ser Val Val Met Ala Gln Thr Gln Pro Arg Arg Glu
195 200 205

Lys Arg Trp Ser Glu Trp Ala Ser Gly Lys Val Leu Cys Leu Leu Asp
210 215 220

Pro Leu Asp Gly Val Tyr Asn Tyr Leu Ala Gln Gln Arg Cys Asn Leu
225 230 235 240

Asp Asp Thr Trp Glu Gly Lys Ile Tyr Arg Val Leu Ala Gly Asn Pro
245 250 255

Ala Lys His Asp Leu Asp Ile Lys Pro Thr Val Ile Ser His Arg Leu
260 265 270

His Phe Pro Glu Gly Gly Ser Leu Ala Ala Leu Thr Ala His Gln Ala
275 280 285

Cys His Leu Pro Leu Glu Thr Phe Thr Arg His Arg Gln Pro Arg Gly
290 295 300

Trp Glu Gln Leu Glu Gln Cys Gly Tyr Pro Val Gln Arg Leu Val Ala
305 310 315 320

Leu Tyr Leu Ala Ala Arg Leu Ser Trp Asn Gln Val Asp Gln Val Ile
325 330 335

Arg Asn Ala Leu Ala Ser Pro Gly Ser Gly Gly Asp Leu Gly Glu Ala
340 345 350

Ile Arg Glu Gln Pro Glu Gln Ala Arg Leu Ala Leu Thr Leu Ala Ala
355 360 365

Ala Glu Ser Glu Arg Phe Val Arg Gln Gly Thr Gly Asn Asp Glu Ala
370 375 380
Gly Ala Ala Asn Ala Asp Val Val Ser Leu Thr Cys Pro Val Ala Ala
385 390 395 400

Gly Glu Cys Ala Gly Pro Ala Asp Ser Gly Asp Ala Leu Leu Glu Arg
405 410 415

Asn Tyr Pro Thr Gly Ala Glu Phe Leu Gly Asp Gly Gly Asp Val Ser
420 425 430

Phe Ser Thr Arg Gly Met Gly Ala Arg Ala Ser Val Leu Ser Gly Gly
435 440 445

Glu Leu Asp Lys Trp Glu Lys Ile Arg Leu Arg Pro Gly Gly Lys Lys
450 455 460

Gln Tyr Lys Leu Lys His Ile Val Trp Ala Ser Arg Glu Leu Glu Arg
465 470 475 480

Phe Ala Val Asn Pro Gly Leu Leu Glu Thr Ser Glu Gly Cys Arg Gln
485 490 495

Ile Leu Gly Gln Leu Gln Pro Ser Leu Gln Thr Gly Ser Glu Glu Leu
500 505 510

Arg Ser Leu Tyr Asn Thr Ile Ala Val Leu Tyr Cys Val His Gln Arg
515 520 525

Ile Asp Val Lys Asp Thr Lys Glu Ala Leu Asp Lys Ile Glu Glu Glu
530 535 540

Gln Asn Lys Ser Lys Lys Lys Ala Gln Gln Ala Ala Ala Asp Thr Gly
545 550 555 560

Asn Asn Ser Gln Val Ser Gln Asn Tyr Pro Ile Val Gln Asn Leu Gln
565 570 575

Gly Gln Met Val His Gln Ala Ile Ser Pro Arg Thr Leu Asn Ala Trp
580 585 590

Val Lys Val Val Glu Glu Lys Ala Phe Ser Pro Glu Val Ile Pro Met
595 600 605

Phe Ser Ala Leu Ser Glu Gly Ala Thr Pro Gln Asp Leu Asn Thr Met
610 615 620

Leu Asn Thr Val Gly Gly His Gln Ala Ala Met Gln Met Leu Lys Glu
625 630 635 640
Thr Ile Asn Glu Glu Ala Ala Glu Trp Asp Arg Leu His Pro Val His
645 650 655

Ala Gly Pro Ile Ala Pro Gly Gln Met Arg Glu Pro Arg Gly Ser Asp
660 665 670

Ile Ala Gly Thr Thr Ser Thr Leu Gln Glu Gln Ile Gly Trp Met Thr
675 680 685

His Asn Pro Pro Ile Pro Val Gly Glu Ile Tyr Lys Arg Trp Ile Ile
690 695 700

Leu Gly Leu Asn Lys Ile Val Arg Met Tyr Ser Pro Thr Ser Ile Leu
705 710 715 720

Asp Ile Arg Gln Gly Pro Lys Glu Pro Phe Arg Asp Tyr Val Asp Arg
725 730 735

Phe Tyr Lys Thr Leu Arg Ala Glu Gln Ala Ser Gln Glu Val Lys Asn
740 745 750

Trp Met Thr Glu Thr Leu Leu Val Gln Asn Ala Asn Pro Asp Cys Lys
755 760 765

Thr Ile Leu Lys Ala Leu Gly Pro Gly Ala Thr Leu Glu Glu Met Met
770 775 780

Thr Ala Cys Gln Gly Val Gly Gly Pro Gly His Lys Ala Arg Val Leu
785 790 795 800

Ala Glu Ala Met Ser Gln Val Thr Asn Pro Ala Thr Ile Met Ile Gln
805 810 815

Lys Gly Asn Phe Arg Asn Gln Arg Lys Thr Val Lys Cys Phe Asn Cys
820 825 830

Gly Lys Glu Gly His Ile Ala Lys Asn Cys Arg Ala Pro Arg Lys Lys
835 840 845

Gly Cys Trp Lys Cys Gly Lys Glu Gly His Gln Met Lys Asp Cys Thr
850 855 860

Glu Arg Gln Ala Asn Phe Leu Gly Lys Ile Trp Pro Ser His Lys Gly
865 870 875 880

Arg Pro Gly Asn Phe Leu Gln Ser Arg Pro Glu Pro Thr Ala Pro Pro
885 890 895
Glu Glu Ser Phe Arg Phe Gly Glu Glu Thr Thr Thr Pro Ser Gln Lys
900 905 910

Gln Glu Pro Ile Asp Lys Glu Leu Tyr Pro Leu Ala Ser Leu Arg Ser
915 920 925

Leu Phe Gly Ser Asp Pro Ser Ser Gln Xaa
930 935
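
The correspondence between the nucleotide listing SEQ ID NO:57 and the amino acid listing SEQ ID NO:58 can be spot-checked by translating codons with the standard genetic code. The following Python sketch is an illustration only and is not part of the original application. The names `CODON_TABLE`, `THREE_TO_ONE` and `translate` are hypothetical, and only the first 60 bases and first 20 residues are compared.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def translate(dna):
    """Translate an uppercase DNA string in frame 1; stop codons become '*'."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# First 60 bases of SEQ ID NO:57 and first 20 residues of SEQ ID NO:58.
dna_start = "ATGAAAAAGACAGCTATCGCGATTGCAGTGGCACTGGCTGGTTTCGCTACCGTAGCGCAG"
protein_start = ("Met Lys Lys Thr Ala Ile Ala Ile Ala Val "
                 "Ala Leu Ala Gly Phe Ala Thr Val Ala Gln")

expected = "".join(THREE_TO_ONE[r] for r in protein_start.split())
print(translate(dna_start))
print(translate(dna_start) == expected)
```

The last codon of SEQ ID NO:57 is TGA, a stop codon in the standard code, which is consistent with the final position of SEQ ID NO:58 being listed as Xaa rather than as an amino acid.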



Claims 

1. A recombinant DNA segment comprising a nucleotide sequence coding for a hybrid protein comprising a 
modified Pseudomonas exotoxin and a polypeptide that is exogenous to an antigen-presenting cell, said 
hybrid capable of being at least partially presented on an antigen-presenting cell surface. 

2. A recombinant DNA segment comprising a nucleotide sequence coding for a hybrid protein comprising a 
modified Pseudomonas exotoxin and a polypeptide of viral origin, said hybrid capable of being at least 
partially presented on an antigen-presenting cell surface. 

3. A recombinant DNA segment comprising a nucleotide sequence coding for a hybrid protein comprising a
modified Pseudomonas exotoxin and a polypeptide of viral origin, said hybrid being capable of being
internalized by an antigen-presenting cell and further capable of being at least partially presented on the
surface of said antigen-presenting cell.

4. A recombinant DNA segment comprising a nucleotide sequence coding for a hybrid protein comprising a 
modified Pseudomonas exotoxin and a polypeptide of viral origin, said hybrid capable of being internalized 
by an antigen-presenting cell and further capable of being processed for at least partial presentation on 
the surface of said antigen-presenting cell, sufficiently to elicit an immune response by cytotoxic T
lymphocytes.

5. A transformant harboring a recombinant DNA segment comprising a nucleotide sequence coding for a
hybrid protein comprising a modified Pseudomonas exotoxin and a polypeptide that is exogenous to an
antigen-presenting cell, said hybrid capable of eliciting an immune response by cytotoxic T lymphocytes.

6. A transformant harboring a recombinant DNA segment comprising a nucleotide sequence coding for a
hybrid protein comprising a modified Pseudomonas exotoxin and a polypeptide that is exogenous to an
antigen-presenting cell, said hybrid capable of being at least partially presented on an antigen-presenting
cell surface.

7. A transformant harboring a recombinant DNA segment comprising a nucleotide sequence coding for a
hybrid protein comprising a modified Pseudomonas exotoxin and a polypeptide of viral origin, said hybrid
capable of being at least partially presented on an antigen-presenting cell surface.

8. A transformant harboring a recombinant DNA segment comprising a nucleotide sequence coding for a
hybrid protein comprising a modified Pseudomonas exotoxin and a polypeptide of viral origin, said hybrid
capable of being internalized by an antigen-presenting cell, and further capable of being at least partially
presented on the surface of said antigen-presenting cell.

9. The recombinant DNA segment as claimed in any one of claims 1 to 4, wherein said modified
Pseudomonas exotoxin lacks a functioning ADP ribosylating domain.

10. The recombinant DNA segment as claimed in claim 2, wherein said polypeptide of viral origin is a viral 
protein fragment comprising the matrix protein of influenza A virus. 

11. The recombinant DNA segment as claimed in claim 10, wherein said viral protein fragment comprises
residues 57 to 68 of the matrix protein of influenza A virus.

12. The recombinant DNA segment as claimed in claim 2, wherein said polypeptide of viral origin is a viral 
protein fragment comprising the gag protein of human immunodeficiency virus-1. 

13. The recombinant DNA segment as claimed in claim 2, wherein said polypeptide of viral origin is a viral 
protein fragment comprising the nucleoprotein of influenza A virus. 

14. The transformant as claimed in claim 5, wherein said modified Pseudomonas exotoxin lacks a functioning
ADP ribosylating domain. 

15. The transformant as claimed in claim 7, wherein said polypeptide of viral origin is a viral protein fragment 
comprising the viral matrix protein of influenza A virus. 

16. The transformant as claimed in claim 15, wherein said viral protein fragment comprises residues 57 to 68
of the matrix protein of influenza A virus. 

17. The transformant as claimed in claim 7, wherein said polypeptide of viral origin is a viral protein fragment 
which is sufficiently specific to bind to HLA-2. 

18. The transformant as claimed in claim 7, wherein said polypeptide of viral origin is a viral protein fragment 
comprising the nucleoprotein of influenza A virus. 

19. The transformant as claimed in claim 7, wherein said polypeptide of viral origin is a viral protein fragment 
comprising the gag protein of human immunodeficiency virus-1. 